Abstract

We study the optimal transport between two probability measures 
on the real line, where the transport plans are laws of one-step martingales.
A quasi-sure formulation of the dual problem is introduced
and shown to yield a complete duality theory for general marginals 
and measurable reward (cost) functions: absence of a duality gap and 
existence of dual optimizers. Both properties are shown to fail in the 
classical formulation. As a consequence of the duality result, we obtain 
a general principle of cyclical monotonicity describing the geometry of 
optimal transports. 
1 Introduction

Let $\mu, \nu$ be probability measures on the real line $\mathbb{R}$. A Monge-Kantorovich transport from $\mu$ to $\nu$ is a probability $P$ on $\mathbb{R}^2$ whose marginals are $\mu$ and $\nu$, respectively; that is, if $(X, Y)$ is the identity map on $\mathbb{R}^2$, then $\mu = P \circ X^{-1}$ is the distribution of $X$ under $P$ and similarly $\nu = P \circ Y^{-1}$. The set of all these transports is denoted by $\Pi(\mu,\nu)$. Let $P \in \Pi(\mu,\nu)$ and consider the disintegration $P = \mu \otimes \kappa$. If the stochastic kernel $\kappa(x, dy) = P[\,\cdot\,|X = x]$ is given by the Dirac mass $\delta_{T(x)}$ for a map $T : \mathbb{R} \to \mathbb{R}$, then $T$ is called the corresponding Monge transport. In general, a Monge-Kantorovich transport may be interpreted as a randomized Monge transport.

Let $f$ be a (measurable) real function on $\mathbb{R}^2$; then the cumulative reward for transporting $\mu$ to $\nu$ according to $P$ is

$$P(f) := E^P[f(X,Y)] = \int f(x,y)\, P(dx,dy)$$

and the Monge-Kantorovich optimal transport problem is given by

$$\sup_{P \in \Pi(\mu,\nu)} P(f). \tag{1.1}$$

In an alternate interpretation, the negative of $f$ is seen as a cost and the above is the minimization of the cumulative cost. One advantage of the Monge-Kantorovich formulation is that an optimizer $P \in \Pi(\mu,\nu)$ exists as soon as $f$ is upper semicontinuous and sufficiently integrable (of course, existence may fail when $f$ is merely measurable). Optimal transport has been a very active field in the last several decades; we refer to Villani’s monographs gulls] or the lecture notes by Ambrosio and Gigli [2] for background.

In the so-called martingale optimal transport problem, we only consider transports which are martingale laws; then $\mu$ can be seen as the distribution of a martingale at time $t = 0$ and $\nu$ as the distribution of the process at $t = 1$. This problem was introduced by Beiglböck, Henry-Labordère and Penkner [5] in the discrete-time case and by Galichon, Henry-Labordère and Touzi [23] in continuous time. In the present paper, we focus on the most fundamental case, where the transport takes place in a single time step. That is, a martingale transport from $\mu$ to $\nu$ is a law $P \in \Pi(\mu,\nu)$ under which $(X, Y)$ is a martingale; of course, this necessitates that $\mu$ and $\nu$ have finite first moments. We let

$$\mathcal{M}(\mu,\nu) = \{P \in \Pi(\mu,\nu) : E^P[Y|X] = X \ \ P\text{-a.s.}\}$$

denote the set of martingale transports. Alternately, consider a disintegration $P = \mu \otimes \kappa$ of $P \in \Pi(\mu,\nu)$; then $P$ is a martingale transport if and only if $x$ is the barycenter (mean) of $\kappa(x)$ for $\mu$-a.e. $x \in \mathbb{R}$; that is, $\int y\, \kappa(x, dy) = x$. Here we may also observe that Monge transports are meaningless in this context: only a constant martingale is deterministic.
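
As an illustration of the barycenter condition, the following minimal sketch checks whether a finitely supported plan is a martingale transport. It is not part of the paper: the representation of a plan as a dictionary `{(x, y): mass}` and the helper names `marginals` and `is_martingale_transport` are assumptions made for this example only.

```python
from fractions import Fraction as F

def marginals(plan):
    """Return the first and second marginal of a discrete plan {(x, y): mass}."""
    mu, nu = {}, {}
    for (x, y), p in plan.items():
        mu[x] = mu.get(x, 0) + p
        nu[y] = nu.get(y, 0) + p
    return mu, nu

def is_martingale_transport(plan, mu, nu):
    """Check that plan has marginals (mu, nu) and that each kernel has mean x."""
    first, second = marginals(plan)
    if first != mu or second != nu:
        return False
    for x, mass in first.items():
        # barycenter of the conditional law kappa(x, dy)
        mean = sum(y * p for (x2, y), p in plan.items() if x2 == x) / mass
        if mean != x:
            return False
    return True

half, quarter = F(1, 2), F(1, 4)
mu = {-1: half, 1: half}
nu = {-2: quarter, 0: half, 2: quarter}

# Each point x moves to x - 1 or x + 1 with equal probability.
martingale_plan = {(-1, -2): quarter, (-1, 0): quarter,
                   (1, 0): quarter, (1, 2): quarter}
# The product coupling has the right marginals but is not a martingale law.
product_plan = {(x, y): p * q for x, p in mu.items() for y, q in nu.items()}

print(is_martingale_transport(martingale_plan, mu, nu))  # True
print(is_martingale_transport(product_plan, mu, nu))     # False
```

Exact rational arithmetic is used so that the equality tests on masses and means are not affected by rounding.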

The martingale property induces an asymmetry between $\mu$ and $\nu$: the marginals can only become more dispersed over time. More precisely, the set $\mathcal{M}(\mu,\nu)$ is nonempty if and only if $\mu, \nu$ are in convex order, denoted $\mu \le_c \nu$, meaning that $\mu(\phi) \le \nu(\phi)$ whenever $\phi$ is a convex function (see Proposition 2.1). Under this condition, the martingale optimal transport problem is given by

$$\sup_{P \in \mathcal{M}(\mu,\nu)} P(f). \tag{1.2}$$


The present paper develops a complete duality theory for this problem, for 
general reward functions and marginals. In particular, we obtain existence 
in the dual problem, and that is the main goal of this paper. 


The problem (1.2) was first studied in [a ED]. In analogy to the Hoeffding-Fréchet coupling of classical transport, [7] establishes a measure $P$, the so-called Left-Curtain Coupling, that is optimal in (1.2) for reward functions $f$ of a specific form. This form was generalized to a version of the Spence-Mirrlees condition in [26], where the coupling is also described more explicitly, whereas [31] shows the stability with respect to the marginals. On the other hand, [29, 30] find the optimal transports for $f(x,y) = \pm|x - y|$. A generalization of the martingale transport problem, where an arbitrary linear constraint is imposed on $\Pi(\mu,\nu)$, is studied in [43].

Martingale optimal transport is motivated by considerations of model uncertainty in financial mathematics. Starting with a stream of literature studies robust bounds for option prices via the Skorokhod embedding problem and this can be interpreted as optimal transport in continuous time; cf. ISHIEZ] for surveys. The opposite direction is taken in [3], where Skorokhod embeddings are studied from an optimal transport point of view. Recently, a rich literature has emerged around the topics of model robustness and transport; see, e.g., [U El ESI EH ESI ESI Ell ES] for models in discrete time and milllllslIITlEniEIlElESlEllElESlESlISD] for continuous-time models.


1.1 Duality for Classical Transport 


Let us first recall the duality results for the classical case (1.1). Indeed, the dual problem is given by

$$\inf_{\varphi,\psi}\ \mu(\varphi) + \nu(\psi) \quad \text{subject to} \quad \varphi(x) + \psi(y) \ge f(x,y), \quad (x,y) \in \mathbb{R}^2. \tag{1.3}$$


Here $\varphi \in L^1(\mu)$ and $\psi \in L^1(\nu)$ are real functions that can be seen as Lagrange multipliers for the marginal constraints in (1.1). There are two fundamental results on this duality in a general setting, obtained by Kellerer [33]. First, there is no duality gap; i.e., the values of (1.1) and (1.3) coincide. Second, there exists an optimizer $(\varphi, \psi) \in L^1(\mu) \times L^1(\nu)$ for the dual problem, whenever its value (1.3) is finite. While additional regularity assumptions allow for easier proofs, the results of [33] apply to any Borel function $f : \mathbb{R}^2 \to [0,\infty]$. An important application is the “Fundamental Theorem of Optimal Transport” la sa or “Monotonicity Principle” which describes the trajectories used by optimal transports: there exists a set $\Gamma \subseteq \mathbb{R}^2$ such that a given transport $P \in \Pi(\mu,\nu)$ is optimal for (1.1) if and only if $P$ is concentrated on $\Gamma$. This set can be obtained directly from a dual optimizer $(\varphi, \psi)$ by setting

$$\Gamma = \{(x, y) \in \mathbb{R}^2 : \varphi(x) + \psi(y) = f(x, y)\}. \tag{1.4}$$

In fact, given $\varphi$, one can find $\psi$ by $f$-concave conjugation and vice versa, so that either of the functions may be called the Kantorovich potential of the problem, and then $\Gamma$ is the graph of its $f$-subdifferential. The set $\Gamma$ has an important property called $f$-cyclical monotonicity which can be used to analyze the geometry of optimal transports; we refer to (a iia for further background.


1.2 Duality for Martingale Transport 


Let us now move on to the dual problem in the case of interest, where the martingale constraint gives rise to an additional Lagrange multiplier. Formally, $E^P[Y|X] = X$ is equivalent to $E^P[h(X)(Y - X)] = 0$ for all functions $h$ and thus the domain of the analogue of (1.3) consists of triplets $(\varphi, \psi, h)$ of real functions such that

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge f(x, y), \quad (x, y) \in \mathbb{R}^2, \tag{1.5}$$

while the dual cost function is unchanged:

$$\inf_{\varphi,\psi,h}\ \{\mu(\varphi) + \nu(\psi)\}.$$

In [5], it was shown that there is no duality gap whenever the reward function $f$ is upper semicontinuous and satisfies a linear growth condition, and the analogous result holds in the setting of [43]. On the other hand, a counterexample in [5] showed that the dual problem may fail to admit an optimizer, even if $f$ is bounded continuous and the marginals are compactly supported.

The proofs of the positive results in [5, 43], absence of a duality gap, reduce to classical transport theory by dualizing the martingale constraint and using a minimax argument. Only the latter step requires upper semicontinuity, and it is easy to believe that it is a technical condition necessitated only by the technique of proof. This turns out to be wrong: we provide a counterexample (Example 8.1) showing that the dual problem (1.5) can produce a duality gap in a fairly tame setting with compactly supported marginals and a reward function that is bounded and lower semicontinuous. Regarding the absence of an optimizer, we provide a counterexample (Example 8.2) which is, in a sense to be made specific, simpler than the one in [5] and suggests that failure of existence is generic as soon as the marginals do not satisfy a condition called irreducibility (see below and Section 2) and $f$ is not smooth.

Let us now introduce a formulation of the dual problem which will allow us to overcome both issues and develop a complete duality theory (dual existence and no duality gap) for general reward functions. The most important novelty is that we shall reformulate the pointwise inequality of (1.5) in a quasi-sure way. Indeed, we say that a property holds $\mathcal{M}(\mu,\nu)$-quasi-surely (q.s. for short) if it holds outside a $\mathcal{M}(\mu,\nu)$-polar set; that is, a set which is $P$-null for all $P \in \mathcal{M}(\mu,\nu)$. We then replace (1.5) by

$$\varphi(X) + \psi(Y) + h(X)(Y - X) \ge f(X, Y) \quad \mathcal{M}(\mu,\nu)\text{-q.s.}; \tag{1.6}$$

i.e., the inequality holds $P$-a.s. for all $P \in \mathcal{M}(\mu,\nu)$. For the classical transport, it is known that all polar sets are of a trivial type: they are negligible for one of the marginals. This is different in the martingale case. Indeed, as observed in [3, there are obstacles that cannot be crossed by any martingale transport. These barriers divide the real line into intervals that (almost) do not interact and are therefore called irreducible components. Our first important result (Theorem 3.2) provides a complete characterization of the $\mathcal{M}(\mu,\nu)$-polar sets: a subset of $\mathbb{R}^2$ is polar if and only if it consists of trajectories a) crossing a barrier or b) negligible for one of the marginals. On the strength of this result, we have a rather precise understanding of (1.6); namely, it represents a pointwise inequality on each irreducible component, modulo sets that are not seen by the marginals.

We thus proceed by first studying an irreducible component; the analysis has two parts. On the one hand, there are soft arguments of separation (Hahn-Banach) and extension (Choquet theory) that are familiar from classical transport theory. On the other hand, there is an important closedness result (Proposition 5.2) based on novel arguments: given reward functions $f_n \to f$ and corresponding almost-optimal dual elements $(\varphi_n, \psi_n, h_n)$, we construct a limit $(\varphi, \psi, h)$ for $f$. The proof of this result is deeply linked to the convex order of the marginals. Indeed, we introduce concave functions $\chi_n$ which control $(\varphi_n, \psi_n)$ in the sense of one-sided bounds. A compactness result of Arzelà-Ascoli type is established for the sequence $(\chi_n)$, based on a bound of the form

$$0 \le \int \chi_n \, d(\mu - \nu) \le C. \tag{1.7}$$

After finding a limit $\chi$ for $\chi_n$, we can produce limits for $(\varphi_n, \psi_n)$ by Komlós-type arguments, and the corresponding function $h$ can be found in an a posteriori fashion. The compactness result yields some insight into the failure of the pointwise formulation (1.5) for the global problem: the bound (1.7) does not control the concavity of $\chi_n$ at barriers because the inequality between $\mu$ and $\nu$ is not “strict” in the convex order at these points.

A second relaxation is necessary to obtain our duality result; namely, the cost $\mu(\varphi) + \nu(\psi)$ needs to be defined in an extended sense. We provide counterexamples showing that the existence of dual optimizers (Example 8.4) and in some cases the absence of a duality gap (Example 8.5) break down if one insists on $\varphi$ and $\psi$ being integrable for $\mu$ and $\nu$, individually. We shall see that several natural definitions of $\mu(\varphi) + \nu(\psi)$ lead to the same value.

With these notions in place, our main result (Theorem 7.4) is that duality holds for arbitrary Borel reward functions $f : \mathbb{R}^2 \to [0, \infty]$; here the lower bound can be relaxed easily (Remark 7.5) but not eliminated completely (Example 8.6). Moreover, existence holds in the dual problem whenever it is finite. As a consequence, we derive a monotonicity principle (Corollary 7.8) with a set $\Gamma$ analogous to (1.4) in a fairly definitive form, generalizing and simplifying results of HI Ha].

While there are no previous results on duality for irregular reward functions, we mention that the proof of the monotonicity principle in [7] contains elements of a theory for dual optimizers for the case of continuous $f$, although the dual problem as such is not formalized in [7]. We expect that
the quasi-sure formulation proposed in the present paper will prove to be a 
useful framework not only for the situation at hand but for a large class of 
transport problems; in particular, to obtain dual attainment under general 
conditions. 

The remainder of the paper is organized as follows. In Section 2, we recall preliminaries on the convex order and potential functions. The structure of $\mathcal{M}(\mu,\nu)$-polar sets is characterized in Section 3, and Section 4 discusses the extended definition of $\mu(\varphi) + \nu(\psi)$. The crucial closedness result for the dual problem is obtained in Section 5, which allows us to establish the duality on an irreducible component in Section 6. Section 7 combines the previous results to obtain the global duality theorem and the monotonicity principle. The counterexamples are collected in the concluding Section 8.

2 Preliminaries on the Convex Order 

It will be useful to consider finite measures, not necessarily normalized to be probabilities. The notions introduced in Section 1 extend in an obvious way. Let $\mu, \nu$ be finite measures on $\mathbb{R}$ with finite first moment. We say that $\mu$ and $\nu$ are in convex order, denoted $\mu \le_c \nu$, if $\mu(\phi) \le \nu(\phi)$ for any convex function $\phi : \mathbb{R} \to \mathbb{R}$. It then follows that $\mu$ and $\nu$ have the same total mass and the same barycenter. An alternative characterization of this order refers to the so-called potential function, defined by

$$u_\mu : \mathbb{R} \to \mathbb{R}, \quad u_\mu(x) := \int |t - x| \, \mu(dt).$$

This is a nonnegative convex function with a minimum at the median of $\mu$, and $\mu$ can be recovered from $u_\mu$ via the second derivative measure. The following result is known; the nontrivial part is [39, Theorem 8].

Proposition 2.1. Suppose that $\mu(\mathbb{R}) = \nu(\mathbb{R})$. The following are equivalent:

(i) The measures $\mu$ and $\nu$ are in convex order: $\mu \le_c \nu$.

(ii) The potential functions of $\mu$ and $\nu$ are ordered: $u_\mu \le u_\nu$.

(iii) There exists a martingale transport from $\mu$ to $\nu$: $\mathcal{M}(\mu,\nu) \ne \emptyset$.
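
For finitely supported measures, criterion (ii) can be tested in finitely many steps. The sketch below is an illustration under assumptions of our own, not a construction from the paper: measures are dictionaries `{atom: mass}`, and the names `potential` and `in_convex_order` are hypothetical. It uses that $u_\nu - u_\mu$ is piecewise linear with kinks only at atoms and vanishes outside the convex hull of the atoms once mass and mean agree, so it suffices to compare the potentials at the atoms.

```python
from fractions import Fraction as F

def potential(measure, x):
    """u_mu(x) = integral of |t - x| mu(dt) for a discrete measure {atom: mass}."""
    return sum(m * abs(t - x) for t, m in measure.items())

def in_convex_order(mu, nu):
    """Test mu <=_c nu for finitely supported measures via ordered potentials."""
    same_mass = sum(mu.values()) == sum(nu.values())
    same_mean = (sum(t * m for t, m in mu.items())
                 == sum(t * m for t, m in nu.items()))
    if not (same_mass and same_mean):
        return False
    # With equal mass and mean, u_nu - u_mu is zero outside the hull of the
    # atoms and linear between consecutive atoms: checking atoms is enough.
    atoms = set(mu) | set(nu)
    return all(potential(mu, x) <= potential(nu, x) for x in atoms)

half, quarter = F(1, 2), F(1, 4)
dirac0 = {0: F(1)}
coin = {-1: half, 1: half}
spread = {-2: quarter, 0: half, 2: quarter}

print(in_convex_order(dirac0, coin))  # True
print(in_convex_order(coin, dirac0))  # False
print(in_convex_order(coin, spread))  # True
```

The mean is checked explicitly because the comparison is only carried out at the atoms; for equal masses, ordered potentials on all of $\mathbb{R}$ would force equal means automatically.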

It will be important to distinguish the intervals where $u_\mu < u_\nu$ from the points where the potential functions touch, because such points act as barriers for martingale transports. In all that follows, the statement $\mu \le_c \nu$ implicitly means that $\mu, \nu$ are finite measures on $\mathbb{R}$ with finite first moment.

Definition 2.2. The pair $\mu \le_c \nu$ is irreducible if the set $I = \{u_\mu < u_\nu\}$ is connected and $\mu(I) = \mu(\mathbb{R})$. In this situation, let $J$ be the union of $I$ and any endpoints of $I$ that are atoms of $\nu$; then $(I, J)$ is the domain of $(\mu, \nu)$.

As $u_\mu = u_\nu$ outside of $I$ and $\mu(I) = \mu(\mathbb{R})$ and $\mu, \nu$ have the same mass and mean, the measure $\nu$ is concentrated on $J$. More precisely, the open interval $I$ is the interior of the convex hull of the support of $\nu$, and $J$ is the minimal superset of $I$ supporting $\nu$. The marginals $\mu \le_c \nu$ can be decomposed into irreducible components as follows; cf. [3 Theorem 8.4].
Proposition 2.3. Let $\mu \le_c \nu$ and let $(I_k)_{1 \le k \le N}$ be the (open) components of $\{u_\mu < u_\nu\}$, where $N \in \{0, 1, \dots, \infty\}$. Set $I_0 = \mathbb{R} \setminus \bigcup_{k \ge 1} I_k$ and $\mu_k = \mu|_{I_k}$ for $k \ge 0$, so that $\mu = \sum_{k \ge 0} \mu_k$. Then, there exists a unique decomposition $\nu = \sum_{k \ge 0} \nu_k$ such that

$$\mu_0 = \nu_0 \quad \text{and} \quad \mu_k \le_c \nu_k \ \text{ for all } k \ge 1,$$

and this decomposition satisfies $I_k = \{u_{\mu_k} < u_{\nu_k}\}$ for all $k \ge 1$. Moreover, any $P \in \mathcal{M}(\mu,\nu)$ admits a unique decomposition $P = \sum_{k \ge 0} P_k$ such that $P_k \in \mathcal{M}(\mu_k, \nu_k)$ for all $k \ge 0$.
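
To make the intervals $I_k$ concrete, here is a small sketch that lists the open components of $\{u_\mu < u_\nu\}$ for finitely supported marginals. It is only an illustration: the dictionary representation `{atom: mass}` and the function names `potential` and `irreducible_components` are assumptions for this example, and the input is assumed to satisfy $\mu \le_c \nu$ (this is not re-checked).

```python
from fractions import Fraction as F

def potential(measure, x):
    """u_mu(x) = integral of |t - x| mu(dt) for a discrete measure {atom: mass}."""
    return sum(m * abs(t - x) for t, m in measure.items())

def irreducible_components(mu, nu):
    """Open components (l, r) of {u_mu < u_nu}, assuming mu <=_c nu."""
    pts = sorted(set(mu) | set(nu))
    gap = [potential(nu, x) - potential(mu, x) for x in pts]
    components, start = [], None
    for i in range(len(pts) - 1):
        # gap is linear and nonnegative on [pts[i], pts[i+1]], so it is
        # positive in the interior iff it is positive at some endpoint.
        if gap[i] > 0 or gap[i + 1] > 0:
            if start is None:
                start = pts[i]
            if gap[i + 1] == 0:  # potentials touch: a barrier closes the component
                components.append((start, pts[i + 1]))
                start = None
    if start is not None:  # only reachable if the convex order assumption fails
        components.append((start, pts[-1]))
    return components

half, quarter = F(1, 2), F(1, 4)
coin = {-1: half, 1: half}
spread = {-2: quarter, 0: half, 2: quarter}

print(irreducible_components({0: F(1)}, coin))  # [(-1, 1)]
print(irreducible_components(coin, spread))     # [(-2, 0), (0, 2)]
print(irreducible_components(coin, coin))       # []
```

In the second example the potentials touch at $0$, so no martingale transport from `coin` to `spread` moves mass across $0$, and the atom of $\nu$ at $0$ is shared between the two components.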

The index 0 is special in the above: the measure $P_0$ is the unique martingale transport from $\mu_0$ to itself, given by the law of $x \mapsto (x, x)$ under $\mu_0$. This corresponds to a constant martingale or the identical Monge transport. In particular, $P_0$ does not depend on $P \in \mathcal{M}(\mu,\nu)$. We observe that $P_0$ is concentrated on $\Delta_0 := \Delta \cap I_0^2$, the part of the diagonal $\Delta = \{(x,x) \in \mathbb{R}^2 : x \in \mathbb{R}\}$ which is not contained in any of the squares $I_k \times J_k$ for $k \ge 1$. Thus, $\Delta_0$ will play a role similar to $I_k \times J_k$ for $k = 0$.

A second remark is that both of the families $(\mu_k)_{k \ge 0}$ and $(P_k)_{k \ge 0}$ are mutually singular, whereas $(\nu_k)_{k \ge 0}$ need not be. Indeed, an atom of $\nu$ may be split such as to contribute to two adjacent components $\nu_k$.

We close this section with a technical remark for later use.

Remark 2.4. We observe from the definition that the continuous convex function $u_\mu$ is affine to the left and to the right of the support of $\mu$, with absolute slope equal to the mass of $\mu$. Moreover, discontinuities of the first derivative correspond to atoms of $\mu$.

Let $\mu \le_c \nu$ be irreducible with domain $(I, J)$ and write $I = (l, r)$. As $\mu(I) = \mu(\mathbb{R})$, the measure $\mu$ cannot have atoms at the boundary points of $I$. Suppose that $r < \infty$; then the derivative $\partial u_\mu(r)$ exists and is equal to $\mu(\mathbb{R})$. However, the measure $\nu$ may have an atom at $r$, and while the right derivative $\partial^+ u_\nu(r)$ is always equal to $\partial u_\mu(r)$, the left derivative satisfies

$$\partial u_\mu(r) - \partial^- u_\nu(r) = 2\nu(\{r\}).$$

Similarly, if $l > -\infty$, we have $\partial^- u_\nu(l) = \partial u_\mu(l) = -\mu(\mathbb{R})$ and

$$\partial^+ u_\nu(l) - \partial u_\mu(l) = 2\nu(\{l\}).$$

3 The Structure of $\mathcal{M}(\mu,\nu)$-Polar Sets

The goal of this section is to characterize the sets which cannot be charged by any martingale transport. Given a collection $\mathcal{P}$ of measures on some space $\Omega$, a set $B \subseteq \Omega$ is called $\mathcal{P}$-polar if it is $P$-null for every $P \in \mathcal{P}$.



For the classical mass transport, the following result can be obtained by applying Kellerer’s duality theorem [33] to the indicator function $f = 1_B$; cf. m Proposition 2.1].

Proposition 3.1. Let $\mu, \nu$ be finite measures of the same total mass and let $B \subseteq \mathbb{R}^2$ be a Borel set. Then $B$ is $\Pi(\mu,\nu)$-polar if and only if there exist a $\mu$-nullset $N_\mu$ and a $\nu$-nullset $N_\nu$ such that

$$B \subseteq (N_\mu \times \mathbb{R}) \cup (\mathbb{R} \times N_\nu).$$

The above result, which holds true more generally for arbitrary Polish spaces, states that the only $\Pi(\mu,\nu)$-polar sets are the obvious ones: the sets which are not seen by the marginals. This is the reason why in the classical dual transport problem, there is no difference between a quasi-sure formulation and a pointwise formulation. Namely, if $\varphi(X) + \psi(Y) \ge f$ holds $\Pi(\mu,\nu)$-q.s., let $B$ be the exceptional set and let $N_\mu, N_\nu$ be as above. Then setting $\varphi = \infty$ on $N_\mu$ and $\psi = \infty$ on $N_\nu$ yields $\varphi(X) + \psi(Y) \ge f$ pointwise on $\mathbb{R}^2$, without changing the cost $\mu(\varphi) + \nu(\psi)$.

The situation is fundamentally different for the martingale transport. Unless $\mu \le_c \nu$ is irreducible, there are obstructions to all martingale transports, and more precisely, a set that “fails to be on a component” is polar, even if it is seen by the marginals. The following result completely describes the structure of $\mathcal{M}(\mu,\nu)$-polar sets.


Figure 1: In this illustration of Theorem 3.2, the striped areas correspond 
to the domains of two irreducible components. The dotted areas are polar 
even though they are not negligible for the marginals. 
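Theorem 3.2. Let $\mu \le_c \nu$ and let $B \subseteq \mathbb{R}^2$ be a Borel set. Then $B$ is $\mathcal{M}(\mu,\nu)$-polar if and only if there exist a $\mu$-nullset $N_\mu$ and a $\nu$-nullset $N_\nu$ such that

$$B \subseteq (N_\mu \times \mathbb{R}) \cup (\mathbb{R} \times N_\nu) \cup \Big(\Delta \cup \bigcup_{k \ge 1} I_k \times J_k\Big)^c,$$

where $\Delta = \{(x,x) \in \mathbb{R}^2 : x \in \mathbb{R}\}$ is the diagonal.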

The main step in the proof is the following construction. 

Lemma 3.3. Let $\mu \le_c \nu$ be irreducible and let $\pi$ be a finite measure on $\mathbb{R}^2$ whose marginals $\pi_1, \pi_2$ satisfy¹ $\pi_1 \le \mu$ and $\pi_2 \le \nu$. Then, there exists $P \in \mathcal{M}(\mu,\nu)$ such that $P$ dominates $\pi$ in the sense of absolute continuity.

¹By $\pi_1 \le \mu$ we mean that $\pi_1(A) \le \mu(A)$ for every Borel set $A \subseteq \mathbb{R}$.

Proof. Let $(I, J)$ be the domain of $(\mu, \nu)$. We may assume that $\mu, \nu$ are probability measures; in particular, $I \ne \emptyset$.

(i) We first show the result under the additional hypothesis that $\pi$ is supported on a compact rectangle $K \times L \subseteq I \times J$.

Writing $I = (l, r)$, the definition of $(I, J)$ implies that $\nu$ assigns positive mass to any neighborhood of $l$. Since $K$ is compact, it has positive distance to $l$ and we can find a compact set $B_- \subseteq J$ with $\nu(B_-) > 0$ to the left of $K$; i.e., $l \le y < x$ for all $y \in B_-$ and $x \in K$. Similarly, we can find a compact $B_+ \subseteq J$ with positive mass to the right of $K$. Let

$$\pi = \pi_1 \otimes \kappa$$

be a disintegration of $\pi$; we may choose a version of the kernel $\kappa(x, dy)$ that is concentrated on $L$ for all $x \in K$. We shall now change the mean of $\kappa(x)$ such as to render it a martingale kernel. Indeed, let us introduce a kernel $\kappa'$ of the form

$$\kappa'(x, dy) = \frac{\kappa(x, dy) + s_-(x)\, \nu(dy)|_{B_-} + s_+(x)\, \nu(dy)|_{B_+}}{c(x)}, \quad x \in K.$$

Here $c(x) \ge 1$ is the normalizing constant such that $\kappa'(x, dy)$ is a stochastic kernel. Moreover, for $x$ such that the mean of $\kappa(x)$ is smaller or equal to $x$, we set $s_-(x) := 0$ and define $s_+(x)$ as the unique nonnegative scalar such that the mean of $\kappa'(x)$ equals $x$, and analogously in the opposite case. Note that $s_\pm$ are well-defined because $B_\pm$ is at a positive distance to the left (resp. right) of $x \in K$. Then,


TT 


/ 




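is a martingale measure with $\pi' \gg \pi$ and its marginals $\mu', \nu'$ satisfy $\mu' \le \pi_1 \le \mu$ as well as $\nu' \le \nu$; the latter is due to $\pi_1(\mathbb{R}) \le \mu(\mathbb{R}) = 1$ and

$$\kappa'(x) \le \nu(B_-)^{-1}\, \nu|_{B_-} + \nu(B_+)^{-1}\, \nu|_{B_+} + \kappa(x)$$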


and 


'Ki®v\b_ + t^i®i'\b+ + vTi ( g )«; < 3 z ^. 


We also note that

$$\pi' \text{ is concentrated on a compact square } K \times L' \tag{3.1}$$

where $L' \subseteq J$ is the convex set generated by $B_-$ and $B_+$. It remains to find $P \in \mathcal{M}(\mu,\nu)$ such that $P \gg \pi'$.

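(a) We first consider the case where $I = J$. Since $u_\nu - u_\mu$ is continuous and strictly positive on $I$, this difference is uniformly bounded away from zero on the compact set $L' \subseteq I$. On the other hand, the continuous function $u_{\nu'} - u_{\mu'}$ is uniformly bounded on $L'$. Hence, there is $0 < \varepsilon < 1$ such that

$$u_\mu - \varepsilon u_{\mu'} \le u_\nu - \varepsilon u_{\nu'} \quad \text{on } L',$$

but then this inequality extends to the whole of $\mathbb{R}$ because $u_{\mu'} = u_{\nu'}$ outside of $L'$, due to (3.1). Noting also that the potential function is linear in the measure, so that $u_{\mu - \varepsilon\mu'} = u_\mu - \varepsilon u_{\mu'}$ and similarly for $\nu$, we thus have

$$\mu - \varepsilon\mu' \le_c \nu - \varepsilon\nu',$$

and these are nonnegative measures due to $\mu' \le \mu$ and $\nu' \le \nu$. Hence, $\mathcal{M}(\mu - \varepsilon\mu', \nu - \varepsilon\nu')$ is nonempty; cf. Proposition 2.1. Let $\pi_\varepsilon$ be any element of that set and define

$$P := \varepsilon\pi' + \pi_\varepsilon.$$

By construction, $P$ is an element of $\mathcal{M}(\mu,\nu)$ and $P \gg \pi' \gg \pi$.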

(b) Next, we discuss the case where $\nu$ has an atom at one or both of the endpoints of $I$. Suppose that $\nu(\{r\}) > 0$; then $L'$ may touch the right boundary of $J$ and we need to give a different argument for the existence of $\varepsilon > 0$ as above, since $u_\nu - u_\mu$ need no longer be bounded away from zero on $L'$. However, the left derivatives satisfy $\partial^- u_\nu(r) < \partial^- u_\mu(r)$ by Remark 2.4, and similarly at $l$ if $\nu(\{l\}) > 0$. Recalling that the derivatives of any potential function (and in particular of $u_{\mu'}$ and $u_{\nu'}$) are uniformly bounded by the total mass of the corresponding measure, we see that we can still find $\varepsilon > 0$ such that $u_\mu - \varepsilon u_{\mu'} \le u_\nu - \varepsilon u_{\nu'}$. The rest is as above.

(ii) Finally, we treat the general case. As $\pi_1 \le \mu$ and $\pi_2 \le \nu$, the measure $\pi$ is necessarily concentrated on $I \times J$. We can cover $I \times J$ with a sequence $(Q_n)_{n \ge 1}$ of compact rectangles $Q_n \subseteq I \times J$ and define measures $\pi^n$ supported by $Q_n$ such that $\pi = \sum_n \pi^n$. For each $n$, our construction in (i) yields a martingale transport plan $P^n \in \mathcal{M}(\mu,\nu)$ with $\pi^n \ll P^n$, and then $P = \sum_{n \ge 1} 2^{-n} P^n$ satisfies the requirement of the lemma. □

Corollary 3.4. The pair $\mu \le_c \nu$ is irreducible if and only if the $\Pi(\mu,\nu)$-polar sets and the $\mathcal{M}(\mu,\nu)$-polar sets coincide.


Proof. If $\mu \le_c \nu$ is irreducible, the conclusion is an immediate consequence of the preceding lemma. Conversely, suppose that $\mu \le_c \nu$ is not irreducible; that is, there exists $x \in \mathbb{R}$ such that $u_\mu(x) = u_\nu(x)$ and $\mu(-\infty, x) > 0$ and $\mu(x, \infty) > 0$. Then, the set $(-\infty, x) \times (x, \infty)$ is $\mathcal{M}(\mu,\nu)$-polar (Proposition 2.3) but not $\Pi(\mu,\nu)$-polar (Proposition 3.1). □


Proof of Theorem 3.2. By Proposition 2.3 and Corollary 3.4, a Borel set $B$ is $\mathcal{M}(\mu,\nu)$-polar if and only if $B \cap (I_k \times J_k)$ is $\Pi(\mu_k, \nu_k)$-polar for all $k \ge 1$ and $B \cap \Delta$ is $P_0$-null. The result now follows by applying Proposition 3.1 for each $k \ge 1$. □


4 A Generalized Integral 

4.1 Integral of a Concave Function 

Let $\mu \le_c \nu$ be irreducible with domain $(I, J)$ and let $\chi : J \to \mathbb{R}$ be a concave function². We assume that $I \ne \emptyset$. Our first aim is to define the difference $\mu(\chi) - \nu(\chi)$. Indeed, $\mu(\chi)$ and $\nu(\chi)$ are well defined in $[-\infty, \infty)$ as $\chi^+$ has linear growth, but we need to elaborate on the difference. There are (at least) three natural definitions, and we shall see that they all yield the same value. To that end, note that $\chi$ is continuous on $I$ by concavity, but may have downward jumps at the boundary $J \setminus I$. We denote the absolute magnitude of the jump at $y$ by $|\Delta\chi(y)|$.


(1) Approximation. Let $I_n$ be a sequence of open, bounded intervals increasing strictly to $I$ (i.e., $I \setminus I_n$ has two components for all $n$) and consider the concave, linearly growing functions $\chi_n : J \to \mathbb{R}$ defined by the following conditions: $\chi_n = \chi$ on $I_n$ and on $J \setminus I$, whereas $\chi_n$ is affine on each component of $I \setminus I_n$, with continuous first derivative at the endpoints of $I_n$. Then, $\mu(\chi_n)$ and $\nu(\chi_n)$ are both finite and we set

$$\mathbf{I}_1(\chi, \mu - \nu) := \lim_{n \to \infty} [\mu(\chi_n) - \nu(\chi_n)]. \tag{4.1}$$

We shall see below that the limit exists in $[0, \infty]$.

²In fact, we will not need irreducibility for the results of this section, except for Example 4.5 and Remark 4.6. Moreover, we could allow $\chi$ to take the value $-\infty$ on $J \setminus I$.

(2) Integration by Parts. Let $-\chi''$ be the (locally finite) second derivative measure of the convex function $-\chi$ on $I$ and set

$$\mathbf{I}_2(\chi, \mu - \nu) := \frac{1}{2} \int_I (u_\mu - u_\nu) \, d\chi'' + \int_{J \setminus I} |\Delta\chi| \, d\nu. \tag{4.2}$$

As $u_\mu \le u_\nu$ and $\chi'' \le 0$, this quantity is well defined in $[0, \infty]$.

(3) Disintegration. Fix an arbitrary $P \in \mathcal{M}(\mu,\nu)$ and consider a disintegration $P = \mu \otimes \kappa$; then we have $\int \chi(y) \, \kappa(x, dy) \le \chi(x)$ for $\mu$-a.e. $x \in I$ by Jensen’s inequality. Thus,

$$\mathbf{I}_3(\chi, \mu - \nu) := \int_I \Big[ \chi(x) - \int_J \chi(y) \, \kappa(x, dy) \Big] \mu(dx)$$

is well defined in $[0, \infty]$, and we shall see below that this value is independent of the choice of $P \in \mathcal{M}(\mu,\nu)$. This definition was already used in [7].
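
The three definitions can be compared numerically in a simple case. The following sketch is an illustration chosen for this purpose and not an example from the paper: it takes $\mu = \delta_0$, $\nu = \frac12(\delta_{-1} + \delta_1)$ and $\chi(x) = -x^2$, so that $I = (-1, 1)$, $\chi''(dt) = -2\,dt$, $\chi$ is continuous on $J = [-1, 1]$ (no jump term), and the only martingale kernel is $\kappa(0, dy) = \nu(dy)$. Since $\chi$ is integrable here, the limit in (1) reduces to $\mu(\chi) - \nu(\chi)$. The helper names are hypothetical.

```python
def chi(x):
    """Concave test function chi(x) = -x^2."""
    return -x * x

# mu = delta_0 and nu = (delta_{-1} + delta_1) / 2 as {atom: mass}
mu = {0.0: 1.0}
nu = {-1.0: 0.5, 1.0: 0.5}

def integrate(measure, f):
    """Integral of f against a discrete measure."""
    return sum(m * f(t) for t, m in measure.items())

def potential(measure, x):
    """u_mu(x) = integral of |t - x| mu(dt)."""
    return sum(m * abs(t - x) for t, m in measure.items())

# (1) chi is integrable for mu and nu, so the limit is the plain difference.
i1 = integrate(mu, chi) - integrate(nu, chi)

# (2) Integration by parts: (1/2) * int_I (u_mu - u_nu) d(chi''), with
# chi''(dt) = -2 dt, approximated by the midpoint rule on (-1, 1).
def i2(n=1000):
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += (potential(mu, t) - potential(nu, t)) * (-2.0) * h
    return 0.5 * total

# (3) Disintegration: kappa(0, dy) = nu(dy) is the only martingale kernel.
i3 = integrate(mu, lambda x: chi(x) - integrate(nu, chi))

print(i1, i2(), i3)
```

All three quantities agree with the value $1$ up to floating-point error; for an even number of grid cells the midpoint rule is exact here because the integrand is linear on each cell.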


For future reference, let us recall the following fact about the second derivative measure $\chi''$: after normalizing $\chi$ and its left derivative $\chi'$ such that $\chi(a) = \chi'(a) = 0$ for some $a \in I$ (by adding a suitable affine function),

$$\chi(y) = \int_{(l,a)} (y - t)^- \, \chi''(dt) + \int_{[a,r)} (y - t)^+ \, \chi''(dt), \quad y \in I,$$

where $l, r \in [-\infty, \infty]$ are such that $I = (l, r)$. If $\chi$ is continuous at the boundary of $I$, this identity extends to $y \in J$ by monotone convergence.

Lemma 4.1. The values $\mathbf{I}_i(\chi, \mu - \nu)$ are well defined in $[0, \infty]$, depend only on $\chi$ and $\mu - \nu$, and coincide for $i = 1, 2, 3$.

Proof. By concavity, $\chi$ is continuous on $I$ with possible downward jumps at the boundary. Setting $\bar\chi := \chi$ on $I$ and extending $\bar\chi$ to $J$ by continuity, we have $\chi = \bar\chi - |\Delta\chi| 1_{J \setminus I}$ where $\bar\chi$ is concave and continuous. By linearity of the $\nu$-integral, it suffices to show the claim for $\bar\chi$; in other words, we may assume that $\chi$ is continuous.

Suppose first that $\chi \in L^1(\mu) \cap L^1(\nu)$. Then, it is clear that $\mathbf{I}_i(\chi, \mu - \nu)$ is well defined for $i = 1, 2, 3$ and that $\mathbf{I}_1(\chi, \mu - \nu) = \mathbf{I}_3(\chi, \mu - \nu)$. To see the equality with $\mathbf{I}_2(\chi, \mu - \nu)$, let $a \in I$ be arbitrary. Writing again $I = (l, r)$, we have

$$\int_J \chi(s) \, (\mu - \nu)(ds) = \int_{[l,a)} \int_{(l,a)} (t - s)^+ \, \chi''(dt) \, (\mu - \nu)(ds) + \int_{[a,r]} \int_{[a,r)} (s - t)^+ \, \chi''(dt) \, (\mu - \nu)(ds).$$

Applying Fubini’s theorem to both integrals and noting that the integrands vanish on certain sets, this can be rewritten as

$$\int_{(l,a)} \int_J (t - s)^+ \, (\mu - \nu)(ds) \, \chi''(dt) + \int_{[a,r)} \int_J (s - t)^+ \, (\mu - \nu)(ds) \, \chi''(dt).$$

Substituting $(t - s)^+ = (s - t)^+ + t - s$ in the first integral and using that $\mu$ and $\nu$ have the same mass and mean, this equals

$$\int_I \int_J (s - t)^+ \, (\mu - \nu)(ds) \, \chi''(dt).$$

On the other hand, using $|s - t| = 2(s - t)^+ - (s - t)$ yields that

$$(u_\mu - u_\nu)(t) = \int_J |s - t| \, (\mu - \nu)(ds) = 2 \int_J (s - t)^+ \, (\mu - \nu)(ds).$$

It follows that $\mathbf{I}_2(\chi, \mu - \nu) = \mathbf{I}_1(\chi, \mu - \nu)$ and that this value depends only on $\chi$ and $\mu - \nu$.


For general $\chi$, define $\chi_n \in L^1(\mu) \cap L^1(\nu)$ as before (4.1); the above establishes that the values of $\mathbf{I}_i(\chi_n, \mu - \nu)$ coincide for each $n$. Noting that $\chi_n$ decreases to $\chi$ stationarily and that $\chi_{n+1} - \chi_n$ is concave, monotone convergence entails that $\mathbf{I}_i(\chi_n, \mu - \nu) \uparrow \mathbf{I}_i(\chi, \mu - \nu)$ for $i = 2, 3$, and in particular these limits coincide. It now follows that the limit defining $\mathbf{I}_1(\chi, \mu - \nu)$ must exist and have the same value. □


Definition 4.2. We write $(\mu - \nu)(\chi)$ for the common value of $\mathbf{I}_i(\chi, \mu - \nu)$, $i = 1, 2, 3$.

As the notation suggests, we have $(\mu - \nu)(\chi) = \mu(\chi) - \nu(\chi)$ as soon as at least one of the latter integrals is finite; this follows from the representation (4.1). However, it may happen that $(\mu - \nu)(\chi)$ is finite but $\mu(\chi) = \nu(\chi) = -\infty$. The following remark elaborates on this.


Remark 4.3. Suppose that $(\mu - \nu)(\chi)$ is finite; then $\mu(\chi)$ and $\nu(\chi)$ are either both infinite or both finite. Thus, one sufficient condition for their finiteness is that the support of $\mu$ be a compact subset of $I$. A more general condition is the existence of a constant $C > 1$ such that

$$u_\nu - u_{\delta_m} \le C (u_\nu - u_\mu), \tag{4.3}$$

where $m \in I$ is the barycenter of $\mu$. Indeed, by (4.2), this implies that

$$(\delta_m - \nu)(\chi) \le C (\mu - \nu)(\chi);$$

thus, $\nu(\chi) > -\infty$ if the right-hand side is finite. One can formulate a similar sufficient condition by substituting $\delta_m$ with any measure $\rho$ satisfying $\rho \le_c \nu$ and $\rho(\chi) > -\infty$.


Example 4.4. Let $\mu, \nu$ be Gaussian with the same mean and variances $\sigma_\mu^2 < \sigma_\nu^2$. Then, a direct computation shows that $\mu \le_c \nu$ is irreducible and Condition (4.3) is satisfied.


It turns out that atoms at the endpoints of I are helpful in terms of 
integrability. 


Example 4.5. Suppose that $\mu \le_c \nu$ is irreducible with domain $(I, J)$. If $I$ is bounded and $\nu$ has atoms at both endpoints of $I$, then (4.3) is satisfied. Indeed, $u_\nu > u_\mu$ on $I$ and the slopes are separated at the endpoints (cf. Remark 2.4) so that $(u_\nu - u_{\delta_m})/(u_\nu - u_\mu)$ has a positive limit at the boundary. Using these two facts, (4.3) follows.


Very much in the same spirit, we have the following estimate related to 
the preceding example. 


Remark 4.6. Let $\mu \le_c \nu$ be irreducible with domain $(I, J)$, let $I$ have a finite right endpoint $r$ and let $\chi : J \to \mathbb{R}$ be a concave function such that $\chi(a) = \chi'(a) = 0$, where $a \in I$ is the common barycenter of $\mu$ and $\nu$. In particular, $\chi \le 0$ and $\chi 1_{[a,\infty)}$ is concave. If $\nu$ has an atom at $r$, then

$$\chi(r) \ge -\frac{C}{\nu(\{r\})} \int_{[a,\infty)} \chi \, d(\mu - \nu), \tag{4.4}$$

with a constant $C \ge 0$ depending only on $\mu, \nu$.

Indeed, as in Example 4.5, Remark 2.4 implies that there exists $C$ such that (4.3) holds on $[a, \infty)$. As a consequence,

$$-\chi(r)\, \nu(\{r\}) \le -\int_{[a,\infty)} \chi \, d\nu = \int_{[a,\infty)} \chi \, d(\delta_a - \nu) \le C \int_{[a,\infty)} \chi \, d(\mu - \nu),$$

where we have applied Lemma 4.1 to $\chi 1_{[a,\infty)}$.


4.2 Integrability Modulo Concave Functions 

Our next aim is to define expressions of the form $\mu(\varphi) + \nu(\psi)$ in a situation where the individual integrals are not necessarily finite. We continue to assume that $\mu \le_c \nu$ is irreducible with domain $(I, J)$.

Definition 4.7. Let $\varphi : I \to \mathbb{R}$ and $\psi : J \to \mathbb{R}$ be Borel functions. If there exists a concave function $\chi : J \to \mathbb{R}$ such that $\varphi - \chi \in L^1(\mu)$ and $\psi + \chi \in L^1(\nu)$, we say that $\chi$ is a concave moderator for $(\varphi, \psi)$ and set

$$\mu(\varphi) + \nu(\psi) := \mu(\varphi - \chi) + \nu(\psi + \chi) + (\mu - \nu)(\chi) \in (-\infty, \infty],$$

where $(\mu - \nu)(\chi)$ was introduced in Definition 4.2.


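Remark 4.8. The preceding definition is independent of the choice of the
concave moderator $\chi$. Indeed, suppose there is another concave function $\tilde\chi$
such that $\varphi - \tilde\chi \in L^1(\mu)$ and $\psi + \tilde\chi \in L^1(\nu)$; then it follows that
$\chi - \tilde\chi \in L^1(\mu) \cap L^1(\nu)$. Using, for instance, the representation (4.1), we see that

$$(\mu - \nu)(\chi) - (\mu - \nu)(\chi - \tilde\chi) = (\mu - \nu)(\tilde\chi)$$

and now it follows that

$$\mu(\varphi - \chi) + \nu(\psi + \chi) + (\mu - \nu)(\chi) = \mu(\varphi - \tilde\chi) + \nu(\psi + \tilde\chi) + (\mu - \nu)(\tilde\chi)$$

as desired.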

Definition 4.9. We denote by $L^c(\mu, \nu)$ the space of all pairs of Borel functions
$\varphi : I \to \overline{\mathbb{R}}$ and $\psi : J \to \overline{\mathbb{R}}$ which admit a concave moderator $\chi$ such
that $(\mu - \nu)(\chi) < \infty$.

In particular, $\mu(\varphi) + \nu(\psi)$ is well defined and finite for $(\varphi, \psi) \in L^c(\mu, \nu)$,
and has the usual value if $(\varphi, \psi) \in L^1(\mu) \times L^1(\nu) \subseteq L^c(\mu, \nu)$.

The following sanity check confirms that $\mu(\varphi) + \nu(\psi)$ has the expected value
in the context of martingale transport.
Remark 4.10. Let $(\varphi, \psi) \in L^c(\mu, \nu)$ and let $h : I \to \mathbb{R}$ be Borel. If
$\varphi(x) + \psi(y) + h(x)(y - x)$ is bounded from below on $I \times J$, then

$$\mu(\varphi) + \nu(\psi) = P[\varphi(X) + \psi(Y) + h(X)(Y - X)]$$

for any $P \in \mathcal{M}(\mu, \nu)$.

Proof. Let $\chi$ be a concave moderator for $(\varphi, \psi)$. We may suppose that 0 is
a lower bound, so that

$$(\varphi - \chi)(X) + (\psi + \chi)(Y) + \chi(X) - \chi(Y) + h(X)(Y - X) \ge 0.$$

As the first two terms are $P$-integrable, the negative part of the remaining
expression is $P$-integrable and $P[\varphi(X) + \psi(Y) + h(X)(Y - X)]$ equals

$$\mu(\varphi - \chi) + \nu(\psi + \chi) + P[\chi(X) - \chi(Y) + h(X)(Y - X)].$$

Let $P = \mu \otimes \kappa$ be a disintegration of $P$; then by the linear growth of $\chi$, the
following integrals are well defined and equal,

$$\int [\chi(x) - \chi(y) + h(x)(y - x)]\, \kappa(x, dy) = \int [\chi(x) - \chi(y)]\, \kappa(x, dy)$$

for $\mu$-a.e. $x \in I$. As the negative part of $\chi(X) - \chi(Y) + h(X)(Y - X)$ is
$P$-integrable, Fubini's theorem (for kernels) yields

$$P[\chi(X) - \chi(Y) + h(X)(Y - X)] = \iint [\chi(x) - \chi(y)]\, \kappa(x, dy)\, \mu(dx)$$

and the right-hand side equals $(\mu - \nu)(\chi)$ by Lemma 4.1. □


5 Closedness on an Irreducible Component 

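In this section, we analyze the dual problem on a single component; that is,
we continue to assume that $\mu \le_c \nu$ is irreducible with domain $(I, J)$.

Definition 5.1. Let $f : I \times J \to [0, \infty]$. We denote by $\mathcal{D}^{c,pw}_{\mu,\nu}(f)$ the set of
all Borel functions $(\varphi, \psi, h) : \mathbb{R} \to \overline{\mathbb{R}} \times \overline{\mathbb{R}} \times \mathbb{R}$ such that $(\varphi, \psi) \in L^c(\mu, \nu)$ and

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge f(x, y), \quad (x, y) \in I \times J.$$

Moreover, we denote by $\mathcal{D}^{1,pw}_{\mu,\nu}(f)$ the subset of all $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)$ with
$\varphi \in L^1(\mu)$ and $\psi \in L^1(\nu)$.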
We emphasize that in this definition, the inequality is stated in the pointwise
("pw") sense. For later reference, we also note that there are two degrees
of freedom in the choice of $(\varphi, \psi, h)$. Namely, given constants $c_1, c_2 \in \mathbb{R}$, the
triplet $(\varphi, \psi, h)$ belongs to $\mathcal{D}^{c,pw}_{\mu,\nu}(f)$ if and only if the triplet

$$\tilde\varphi(x) = \varphi(x) + c_1 + c_2 x, \quad \tilde\psi(y) = \psi(y) - c_1 - c_2 y, \quad \tilde h(x) = h(x) + c_2 \tag{5.1}$$

does, and then $\mu(\tilde\varphi) + \nu(\tilde\psi) = \mu(\varphi) + \nu(\psi)$.
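
The invariance behind (5.1) is a one-line cancellation: the terms $c_1$ cancel directly, and $c_2 x - c_2 y + c_2 (y - x) = 0$. The following sketch checks this pointwise identity in exact rational arithmetic; the sample triplet and the helper names `shifted` and `lhs` are hypothetical illustrations, not objects from the text.

```python
from fractions import Fraction as F

def shifted(phi, psi, h, c1, c2):
    """Triplet obtained from (phi, psi, h) by the affine shift with constants c1, c2."""
    return (lambda x: phi(x) + c1 + c2 * x,
            lambda y: psi(y) - c1 - c2 * y,
            lambda x: h(x) + c2)

def lhs(phi, psi, h, x, y):
    """Left-hand side phi(x) + psi(y) + h(x)(y - x) of the dual constraint."""
    return phi(x) + psi(y) + h(x) * (y - x)

# Hypothetical sample triplet and constants (assumptions for illustration).
phi = lambda x: x * x
psi = lambda y: 1 + abs(y)
h = lambda x: 2 * x
c1, c2 = F(3, 7), F(-5, 2)

phi2, psi2, h2 = shifted(phi, psi, h, c1, c2)
points = [F(i, 4) for i in range(-8, 9)]
invariant = all(lhs(phi, psi, h, x, y) == lhs(phi2, psi2, h2, x, y)
                for x in points for y in points)
print("constraint unchanged by the shift:", invariant)
```

Since the left-hand side of the constraint is unchanged, so is membership in the dual domain; the equality of the dual values additionally uses that $\mu$ and $\nu$ have the same barycenter.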

The goal of the present section is the following closedness result for
$\mathcal{D}^{c,pw}_{\mu,\nu}(f)$; it is at the very heart of our duality and existence theory.

Proposition 5.2. Suppose that $\mu \le_c \nu$ is irreducible with domain $(I, J)$,
let $f, f_n : I \times J \to [0, \infty]$ be such that $f_n \to f$ pointwise and let
$(\varphi_n, \psi_n, h_n) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f_n)$ satisfy $\sup_n \{\mu(\varphi_n) + \nu(\psi_n)\} < \infty$. Then, there exist
$(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)$ such that $\mu(\varphi) + \nu(\psi) \le \liminf_n \{\mu(\varphi_n) + \nu(\psi_n)\}$.

The irreducible pair $\mu \le_c \nu$ is fixed for the rest of this section, so let us
simplify the notation to $\mathcal{D}^{c}(f) := \mathcal{D}^{c,pw}_{\mu,\nu}(f)$.

As a first step towards the proof of Proposition 5.2, we introduce concave
functions which will control simultaneously $\varphi_n$ and $\psi_n$, in the sense of
one-sided bounds.

Lemma 5.3. Let $(\varphi, \psi, h) \in \mathcal{D}^{c}(0)$. Then, there exists a concave moderator
$\chi : J \to \mathbb{R}$ for $(\varphi, \psi)$ such that

$$\chi \le \varphi \text{ on } I, \qquad -\chi \le \psi \text{ on } J.$$

In particular, $\mu(\chi) + \nu(-\chi) \le \mu(\varphi) + \nu(\psi)$.

Proof. The function

$$\chi(y) := \inf_{x \in I} [\varphi(x) + h(x)(y - x)], \quad y \in J$$

is concave as an infimum of affine functions, and $(\varphi, \psi) \in L^c(\mu, \nu)$ implies
that $\varphi < \infty$ on a nonempty set, so that $\chi < \infty$ everywhere on $J$. Moreover,
we clearly have $\chi \le \varphi$ on $I$. Our assumption that

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge 0, \quad (x, y) \in I \times J \tag{5.2}$$

shows that $\chi \ge -\psi$ on $J$. Since $(\varphi, \psi) \in L^c(\mu, \nu)$, the set $\{\psi < \infty\}$ is dense
in $\mathrm{supp}(\nu)$, and by concavity it follows that $\chi > -\infty$ on the interior of the
convex hull of $\mathrm{supp}(\nu)$; that is, on the interval $I$. Moreover, $\{\psi < \infty\}$ must
contain any atom of $\nu$ and in particular $J \setminus I$, so that $\chi > -\infty$ on $J$.
Setting $\bar\varphi := \varphi - \chi \ge 0$ and $\bar\psi := \psi + \chi \ge 0$, we can write (5.2) as

$$\bar\varphi(x) + \bar\psi(y) + [\chi(x) - \chi(y)] + h(x)(y - x) \ge 0, \quad (x, y) \in I \times J.$$

Let $P = \mu \otimes \kappa$ be a disintegration of some $P \in \mathcal{M}(\mu, \nu)$. For fixed $x \in I$, all
four terms above are bounded from below by linearly growing functions. It
follows that for $\mu$-a.e. $x \in I$, the integral of the left-hand side with respect
to $\kappa(x, dy)$ can be computed term-by-term, which yields

$$\bar\varphi(x) + \int \bar\psi(y)\, \kappa(x, dy) + \int [\chi(x) - \chi(y)]\, \kappa(x, dy).$$

These three terms are nonnegative, and thus the integral with respect to $\mu$
can again be computed term-by-term. By Fubini's theorem and Lemma 4.1
it follows that

$$P[\bar\varphi(X) + \bar\psi(Y) + [\chi(X) - \chi(Y)] + h(X)(Y - X)] = \mu(\bar\varphi) + \nu(\bar\psi) + (\mu - \nu)(\chi). \tag{5.3}$$

Of course, the left-hand side is also equal to $P[\varphi(X) + \psi(Y) + h(X)(Y - X)]$
and therefore finite by Remark 4.10. Thus, the right-hand side is finite as
well. As a result, $(\varphi, \psi) \in L^c(\mu, \nu)$ with concave moderator $\chi$, and

$$\mu(\varphi) + \nu(\psi) = \mu(\bar\varphi) + \nu(\bar\psi) + (\mu - \nu)(\chi) \ge (\mu - \nu)(\chi) = \mu(\chi) + \nu(-\chi)$$

as desired. □
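
The construction of the moderator as an infimum of affine functions can be tried out on a grid. The sketch below is a toy illustration under stated assumptions: the closed interval $[-1, 1]$ stands in for both $I$ and $J$, and the triplet $\varphi(x) = x^2$, $h(x) = 2x$, $\psi(y) = 1 + 2|y|$ is a hypothetical choice satisfying the nonnegativity assumption (5.2). For this choice the infimum is attained at the endpoints, giving $\chi(y) = -1 - 2|y|$.

```python
def chi(y, xs, phi, h):
    """chi(y) = min over grid points x of the affine maps y -> phi(x) + h(x) * (y - x)."""
    return min(phi(x) + h(x) * (y - x) for x in xs)

# Hypothetical triplet (assumption): phi(x) + psi(y) + h(x)(y - x) >= 0 on [-1, 1]^2.
phi = lambda x: x * x
h = lambda x: 2.0 * x
psi = lambda y: 1.0 + 2.0 * abs(y)

xs = [k / 50.0 for k in range(-50, 51)]   # grid on [-1, 1], used for both variables
ys = xs
chi_vals = [chi(y, xs, phi, h) for y in ys]

TOL = 1e-12
feasible = all(phi(x) + psi(y) + h(x) * (y - x) >= -TOL for x in xs for y in ys)
below_phi = all(c <= phi(x) + TOL for c, x in zip(chi_vals, xs))          # chi <= phi
above_minus_psi = all(-c <= psi(y) + TOL for c, y in zip(chi_vals, ys))   # -chi <= psi
midpoint_concave = all(
    chi(0.5 * (a + b), xs, phi, h) >= 0.5 * (chi(a, xs, phi, h) + chi(b, xs, phi, h)) - TOL
    for a in ys[::10] for b in ys[::10])
print(feasible, below_phi, above_minus_psi, midpoint_concave)
```

The two one-sided bounds are exactly the ones asserted in the lemma; the midpoint test is only a grid-level check of concavity, not a proof.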


Let us record a variant of the preceding construction for later use. 

Remark 5.4. Let $(\varphi, \psi, h) : \mathbb{R} \to (-\infty, \infty] \times (-\infty, \infty] \times \mathbb{R}$ be Borel functions
such that

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge 0, \quad (x, y) \in I \times J.$$

Then, $(\varphi, \psi) \in L^c(\mu, \nu)$ if and only if $P[\varphi(X) + \psi(Y) + h(X)(Y - X)] < \infty$
for some (and then all) $P \in \mathcal{M}(\mu, \nu)$.

Proof. The "only if" statement is immediate from Remark 4.10. For the
converse, let $P[\varphi(X) + \psi(Y) + h(X)(Y - X)] < \infty$ for some $P \in \mathcal{M}(\mu, \nu)$;
then $\varphi$ is finite $\mu$-a.s. and $\psi$ is finite $\nu$-a.s. We can then follow the proof of
Lemma 5.3 up to (5.3) to define a concave function $\chi : J \to \mathbb{R}$ such that
$\bar\varphi := \varphi - \chi \ge 0$ and $\bar\psi := \psi + \chi \ge 0$ and

$$P[\varphi(X) + \psi(Y) + h(X)(Y - X)] = \mu(\bar\varphi) + \nu(\bar\psi) + (\mu - \nu)(\chi).$$

Since the left-hand side is finite, the three (nonnegative) terms on the
right-hand side are finite as well; that is, $(\varphi, \psi) \in L^c(\mu, \nu)$ with concave
moderator $\chi$. □

Our second tool for the main result is a compactness principle for concave
functions. Irreducibility is crucial for its proof, so let us restate this standing
condition. The notation $\chi_n'$ refers to the left derivative (say).

Proposition 5.5. Let $\mu \le_c \nu$ be irreducible with domain $(I, J)$ and let $a \in I$
be the common barycenter of $\mu$ and $\nu$. Let $\chi_n : J \to \mathbb{R}$ be concave functions
such that

$$\chi_n(a) = \chi_n'(a) = 0 \quad\text{and}\quad \sup_{n \ge 1} (\mu - \nu)(\chi_n) < \infty.$$

There exists a subsequence $\chi_{n_k}$ which converges pointwise on $J$ to a concave
function $\chi : J \to \mathbb{R}$, and $(\mu - \nu)(\chi) \le \liminf_k (\mu - \nu)(\chi_{n_k})$.

Proof. By our assumption, $(\mu - \nu)(\chi_n)$ is bounded uniformly in $n$. In view
of (4.2), this implies that there exists a constant $C > 0$ such that

$$0 \le \frac{1}{2} \int_I (u_\mu - u_\nu)\, d\chi_n'' \le C \quad\text{and}\quad 0 \le |\Delta\chi_n| \le C,$$

where we have used that $J \setminus I$ consists of (at most two) atoms of $\nu$ and
$|\Delta\chi_n| = 0$ on $I$. By the same fact, we thus have

$$\lim_k |\Delta\chi_{n_k}| = \liminf_n |\Delta\chi_n| \tag{5.4}$$

for a suitable subsequence $\chi_{n_k}$; we may assume that $n_k = k$. Moreover,
the first inequality shows that the sequence of finite measures defined by
$(u_\mu - u_\nu)\, d\chi_n''$ is bounded and thus relatively compact for the weak topology
induced by the compactly supported continuous functions on $I$. Recalling
that $u_\nu - u_\mu$ is continuous and strictly positive on $I$, it follows that $(-\chi_n'')$
is relatively weakly compact as well. In view of $\chi_n'(a) = 0$, this implies
a uniform bound for the Lipschitz constant of $\chi_n$ on any given compact
subset of $I$. Using also $\chi_n(a) = 0$, the Arzela-Ascoli theorem then yields a
function $\chi : I \to \mathbb{R}$ such that $\chi_n \to \chi$ locally uniformly, after passing to a
subsequence. Clearly $\chi$ is concave, and integration by parts shows that $-\chi_n''$
converges weakly to the second derivative measure $-\chi''$ associated with $\chi$.
Approximating $u_\mu - u_\nu$ from above with compactly supported continuous
functions on $I$, we then see that

$$(\mu - \nu)(\chi) = \frac{1}{2} \int_I (u_\mu - u_\nu)\, d\chi'' \le \liminf_{n \to \infty} \frac{1}{2} \int_I (u_\mu - u_\nu)\, d\chi_n'' = \liminf_{n \to \infty} (\mu - \nu)(\chi_n).$$

Together with (5.4), we can define $\chi$ on $J$ and the result follows via (4.2). □


We can now derive the main result of this section. 

Proof of Proposition 5.2. Since $(\varphi_n, \psi_n, h_n) \in \mathcal{D}^{c}(f_n)$ and $f_n \ge 0$, we can
introduce the associated concave functions $\chi_n$ as in Lemma 5.3. Normalizing
$(\varphi_n, \psi_n, h_n)$ as in (5.1) with suitable constants, we may assume that
$\chi_n(a) = \chi_n'(a) = 0$; note that the relations $\chi_n \le \varphi_n$ and $-\chi_n \le \psi_n$ are preserved.
After passing to a subsequence, Proposition 5.5 then yields a pointwise limit
$\chi : J \to \mathbb{R}$ for the $\chi_n$.

Since $\varphi_n \ge \chi_n \to \chi$, Komlos' lemma (in the form of [19, Lemma A1.1]
and its subsequent remark) shows that there are $\tilde\varphi_n \in \mathrm{conv}\{\varphi_n, \varphi_{n+1}, \dots\}$
which converge $\mu$-a.s., and similarly for $\psi_n$. Without loss of generality, we
may assume that $\tilde\varphi_n = \varphi_n$, and similarly for $\psi_n$. Thus, setting

$$\varphi := \limsup \varphi_n \text{ on } I, \qquad \psi := \limsup \psi_n \text{ on } J$$

yields Borel functions $\varphi, \psi$ such that

$$\varphi_n \to \varphi \ \ \mu\text{-a.s.}, \quad \varphi - \chi \ge 0 \qquad\text{and}\qquad \psi_n \to \psi \ \ \nu\text{-a.s.}, \quad \psi + \chi \ge 0.$$

Fatou's lemma and Proposition 5.5 then show that

$$\mu(\varphi - \chi) + \nu(\psi + \chi) + (\mu - \nu)(\chi)$$
$$\le \liminf \mu(\varphi_n - \chi_n) + \liminf \nu(\psi_n + \chi_n) + \liminf (\mu - \nu)(\chi_n)$$
$$\le \liminf [\mu(\varphi_n - \chi_n) + \nu(\psi_n + \chi_n) + (\mu - \nu)(\chi_n)]$$
$$= \liminf [\mu(\varphi_n) + \nu(\psi_n)] < \infty.$$

In particular, this shows that $(\varphi, \psi) \in L^c(\mu, \nu)$ with concave moderator $\chi$,
and then the above may be stated more concisely as

$$\mu(\varphi) + \nu(\psi) \le \liminf [\mu(\varphi_n) + \nu(\psi_n)].$$

It remains to find $h$. For any function $g : J \to \overline{\mathbb{R}}$, let $g^{conc} : J \to \overline{\mathbb{R}}$ denote
the concave envelope. Given a sequence of such functions $g_n$, we have

$$\liminf (g_n^{conc}) \ge (\liminf g_n)^{conc}$$

as $g_n^{conc} \ge g_n$ and $\liminf g_n^{conc}$ is concave. Moreover, $(\varphi_n, \psi_n, h_n) \in \mathcal{D}^{c}(f_n)$
means that $\varphi_n(x) + h_n(x)(y - x) \ge f_n(x, y) - \psi_n(y)$, which implies that

$$\varphi_n(x) + h_n(x)(y - x) \ge [f_n(x, \cdot) - \psi_n]^{conc}(y), \quad (x, y) \in I \times J.$$

Fix $x \in I$; then these two facts yield

$$\liminf [\varphi_n(x) + h_n(x)(y - x)] \ge \liminf [f_n(x, \cdot) - \psi_n]^{conc}(y)$$
$$\ge [\liminf (f_n(x, \cdot) - \psi_n)]^{conc}(y)$$
$$\ge [f(x, \cdot) - \psi]^{conc}(y)$$
$$=: \hat\varphi(x, y)$$

for all $y \in J$, and for the specific choice $y = x$ we obtain that
$\varphi(x) \ge \liminf \varphi_n(x) \ge \hat\varphi(x, x)$.

As $\nu\{\psi = \infty\} = 0$ and $f > -\infty$, we have $\hat\varphi(x, y) > -\infty$ for all $y \in J$.
If $x \notin N := \{\varphi = \infty\}$, the above inequalities also show that $\hat\varphi(x, x) < \infty$
and as a result, the concave function $\hat\varphi(x, \cdot)$ is finite on $J$ and admits a left
derivative

$$h(x) := \partial^{-} \hat\varphi(x, \cdot)(x) \in \mathbb{R}, \quad x \in I.$$

By concavity, it follows that

$$\varphi(x) + h(x)(y - x) \ge \hat\varphi(x, x) + h(x)(y - x) \ge \hat\varphi(x, y) \ge f(x, y) - \psi(y)$$

for all $y \in J$. Setting $h := 0$ on $N$, we then have $\varphi(x) + \psi(y) + h(x)(y - x) \ge f(x, y)$
for all $(x, y) \in I \times J$, because the left-hand side is infinite for $x \in N$.
Thus, $(\varphi, \psi, h) \in \mathcal{D}^{c}(f)$ and the proof is complete. □
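
The last step of the proof recovers $\varphi$ and $h$ from $\psi$ through a concave envelope and its left derivative. The following sketch mimics this on a grid, assuming a hypothetical reward $f(x, y) = |x - y|$ and a hypothetical $\psi(y) = y^2$ on $[-1, 1]$; the envelope is computed as an upper convex hull, which is a standard discrete substitute and not the construction used in the text.

```python
def concave_envelope(ys, gs):
    """Values at ys of the smallest concave function above the points (ys[i], gs[i]).

    ys must be strictly increasing. The envelope is the upper convex hull of the
    points, interpolated linearly between consecutive hull vertices.
    """
    hull = []
    for i in range(len(ys)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (ys[b] - ys[a]) * (gs[i] - gs[a]) - (gs[b] - gs[a]) * (ys[i] - ys[a])
            if cross >= 0:      # vertex b lies on or below the chord from a to i
                hull.pop()
            else:
                break
        hull.append(i)
    env = list(gs)
    for a, b in zip(hull, hull[1:]):
        for i in range(a, b + 1):
            t = (ys[i] - ys[a]) / (ys[b] - ys[a])
            env[i] = (1 - t) * gs[a] + t * gs[b]
    return env

def left_slope(ys, env, i):
    """Left difference quotient of env at index i (right quotient at the left endpoint)."""
    j = max(i, 1)
    return (env[j] - env[j - 1]) / (ys[j] - ys[j - 1])

# Hypothetical data (assumptions for illustration only).
f = lambda x, y: abs(x - y)
psi = lambda y: y * y
grid = [k / 20.0 for k in range(-20, 21)]

phi, h = [], []
for i, x in enumerate(grid):
    env = concave_envelope(grid, [f(x, y) - psi(y) for y in grid])
    phi.append(env[i])                    # phi(x) := envelope of f(x, .) - psi at y = x
    h.append(left_slope(grid, env, i))    # h(x)   := slope of that envelope at y = x

TOL = 1e-9
feasible = all(phi[i] + psi(y) + h[i] * (y - x) >= f(x, y) - TOL
               for i, x in enumerate(grid) for y in grid)
print("dual constraint holds on the grid:", feasible)
```

On the grid the slope chosen at each point is a supergradient of the piecewise-linear envelope, which is what makes the constraint hold; in the continuous setting this role is played by the left derivative.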

6 Duality on an Irreducible Component 

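Let $\mu \le_c \nu$ be irreducible with domain $(I, J)$. We define the primal and dual
values as follows.

Definition 6.1. Let $f : \mathbb{R}^2 \to [0, \infty]$. The primal problem is

$$\mathbf{S}_{\mu,\nu}(f) := \sup_{P \in \mathcal{M}(\mu,\nu)} P(f) \in [0, \infty],$$

where $P(f)$ refers to the outer integral if $f$ is not measurable. The dual
problem is

$$\mathbf{I}^{pw}_{\mu,\nu}(f) := \inf_{(\varphi,\psi,h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)} \mu(\varphi) + \nu(\psi) \in [0, \infty].$$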
The goal of this section is the following duality result; it corresponds to
our main result in the case of irreducible marginals. We recall that a function
$f : \mathbb{R}^2 \to [0, \infty]$ is called upper semianalytic if the sets $\{f \ge c\}$ are analytic
for all $c \in \mathbb{R}$, where a subset of $\mathbb{R}^2$ is called analytic if it is the (forward)
image of a Borel subset of a Polish space under a Borel mapping. Any 
Borel function is upper semianalytic and any upper semianalytic function is 
universally measurable; see, e.g., m Chapter 7] for background. 

Theorem 6.2. Let $\mu \le_c \nu$ be irreducible and let $f : \mathbb{R}^2 \to [0, \infty]$.

(i) If $f$ is upper semianalytic, then $\mathbf{S}_{\mu,\nu}(f) = \mathbf{I}^{pw}_{\mu,\nu}(f) \in [0, \infty]$.

(ii) If $\mathbf{I}^{pw}_{\mu,\nu}(f) < \infty$, there exists a dual optimizer $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)$.

The proof of Theorem 6.2 is based on Proposition 5.2, Choquet's theorem
and a separation argument, so let us introduce the relevant terminology. Let
$[0, \infty]^{\mathbb{R}^2}$ be the set of all functions $f : \mathbb{R}^2 \to [0, \infty]$, let $\mathrm{USA}_+$ be the
sublattice of upper semianalytic functions and let $\mathcal{U}$ be the sublattice of bounded
upper semicontinuous functions; note that $\mathcal{U}$ is stable with respect to
countable infima. A mapping $\mathbf{C} : [0, \infty]^{\mathbb{R}^2} \to [0, \infty]$ is called a $\mathcal{U}$-capacity if it
is monotone, sequentially continuous upwards on $[0, \infty]^{\mathbb{R}^2}$, and sequentially
continuous downwards on $\mathcal{U}$.

We write $\mathbf{S}(f) := \mathbf{S}_{\mu,\nu}(f)$ and $\mathbf{I}(f) := \mathbf{I}^{pw}_{\mu,\nu}(f)$ for the rest of this section;
both of these mappings will turn out to be capacities.


Lemma 6.3. The mapping $\mathbf{S} : [0, \infty]^{\mathbb{R}^2} \to [0, \infty]$ is a $\mathcal{U}$-capacity.

Proof. Since $\mathcal{M}(\mu, \nu)$ is weakly compact, this follows by the standard
arguments presented, e.g., in [33, Propositions 1.21, 1.26]. □

Next, we show the absence of a duality gap for upper semicontinuous
functions. This result is already known from [5, Corollary 1.1] which uses a
minimax argument and Kellerer's duality theorem [33] for classical transport.
We shall give a direct and self-contained proof based on Proposition 5.2.

Lemma 6.4. Let $f \in \mathcal{U}$; then $\mathbf{S}(f) = \mathbf{I}(f)$.


Proof. Let $f : \mathbb{R}^2 \to [0, \infty]$ be bounded and upper semicontinuous; then the
inequality

$$\mathbf{S}(f) \le \mathbf{I}(f) \tag{6.1}$$

follows from Remark 4.10. Below, we show the converse inequality.

(i) We first prove the result for a class of continuous reward functions.
This will be a Hahn-Banach argument, which requires us to introduce a
suitable space.

Recall that $\mu$ has a finite first moment. Thus, by the de la Vallee-Poussin
theorem, there exists an increasing function $\zeta_\mu : \mathbb{R}_+ \to \mathbb{R}_+$ of superlinear
growth such that $x \mapsto \zeta_\mu(|x|)$ is $\mu$-integrable. The same applies to $\nu$, and we
set

$$\zeta(x, y) = 1 + \zeta_\mu(|x|) + \zeta_\nu(|y|), \quad (x, y) \in \mathbb{R}^2.$$

Let $C_\zeta = C_\zeta(\mathbb{R}^2)$ be the vector space of all continuous functions $f : \mathbb{R}^2 \to \mathbb{R}$
such that $f/\zeta$ vanishes at infinity; this includes all continuous functions of
linear growth. We equip $C_\zeta$ with the norm $|f|_\zeta := |f/\zeta|_\infty$, where $|\cdot|_\infty$ is the
uniform norm.

Let $f \in C_\zeta$. Then, setting $\varphi_0(x) = \zeta_\mu(|x|)$ and $\psi_0(y) = \zeta_\nu(|y|)$, we have

$$-c(1 + \varphi_0 + \psi_0) \le f \le c(1 + \varphi_0 + \psi_0)$$

for some constant $c$, showing in particular that $\mathbf{S}(f)$ is finite. Thus, we may
assume that $\mathbf{S}(f) = 0$ by a translation. Consider the set

$$K = \{g \in C_\zeta : \mathbf{I}(g) \le 0\}.$$

This is a convex cone in $C_\zeta$ and Proposition 5.2 implies that $K$ is closed;
here we use that a convergent sequence in $C_\zeta$ is uniformly bounded from
below by a function of the form $-c(1 + \varphi_0 + \psi_0)$.

Assume for contradiction that $\mathbf{I}(f) > 0$; that is, $f \notin K$. Then the
Hahn-Banach theorem and the cone property yield a linear functional $\ell \in C_\zeta^*$
such that $\ell(K) \subseteq \mathbb{R}_-$ and $\ell(f) > 0$. We will argue below that $\ell$ can be
represented by a finite signed measure $\pi$. Note that $\ell(K) \subseteq \mathbb{R}_-$ and the
fact that $K$ contains all functions of the form $\varphi(X) - \mu(\varphi)$ with $\varphi \in C_b(\mathbb{R})$
imply that $\ell(\varphi(X)) = \mu(\varphi)$ for all $\varphi \in C_b(\mathbb{R})$; i.e., $\mu$ is the first marginal
of $\pi$, and similarly $\nu$ is the second marginal. Thus, $\pi \in \Pi(\mu, \nu)$. Moreover, if
$h \in C_b(\mathbb{R})$, then the function $h(X)(Y - X)$ is in $C_\zeta$ due to its linear growth,
and a scaling argument shows that $\ell(h(X)(Y - X)) = 0$. This implies that
$\pi$ is a martingale transport; i.e., $\pi \in \mathcal{M}(\mu, \nu)$. But now $\pi(f) = \ell(f) > 0$
contradicts $\mathbf{S}(f) = 0$, and we have shown that $\mathbf{I}(f) \le \mathbf{S}(f)$.

It remains to argue that the elements of $C_\zeta^*$ can be represented by finite
signed measures. Indeed, $f \mapsto f/\zeta$ is an isomorphism of normed spaces from $C_\zeta$
to the usual space $C_0(\mathbb{R}^2)$ of continuous functions vanishing at infinity with the
uniform norm. By Riesz' representation theorem, any continuous linear functional
on $C_0(\mathbb{R}^2)$ can be represented by a signed measure $m$, and hence any $\ell \in C_\zeta^*$
can be represented as $\ell(f) = m(f/\zeta)$. Using $1/\zeta \in C_0(\mathbb{R}^2) \subseteq L^1(m)$ as a
Radon-Nikodym density, $\ell$ is thus represented by the finite signed measure

$$d\hat m = (1/\zeta)\, dm.$$

(ii) Let $f$ be bounded and upper semicontinuous; then there exist
$f_n \in \mathcal{U} \cap C_\zeta$ decreasing to $f$ and we have $\mathbf{S}(f_n) = \mathbf{I}(f_n)$ for all $n$ by
part (i) of this proof. As $\mathbf{S}(f_n) \to \mathbf{S}(f)$ by the decreasing continuity of $\mathbf{S}$,
cf. Lemma 6.3, it remains to show that $\mathbf{I}(f_n) \to \mathbf{I}(f)$. Since $f \le f_n$, we have
$\mathbf{I}(f) \le \mathbf{I}(f_n)$ for all $n$. On the other hand, (6.1) shows that

$$\lim \mathbf{I}(f_n) = \lim \mathbf{S}(f_n) = \mathbf{S}(f) \le \mathbf{I}(f)$$

and this completes the proof. □


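Our last preparation for the proof of Theorem 6.2 is to show that $\mathbf{I}$ is
a capacity; again, this is a consequence of the closedness result in
Proposition 5.2.

Lemma 6.5. The mapping $\mathbf{I} : [0, \infty]^{\mathbb{R}^2} \to [0, \infty]$ is a $\mathcal{U}$-capacity.

Proof. As $\mathbf{I} = \mathbf{S}$ on $\mathcal{U}$ by Lemma 6.4, Lemma 6.3 already shows that $\mathbf{I}$
is sequentially continuous downwards on $\mathcal{U}$. Let $f, f_n \in [0, \infty]^{\mathbb{R}^2}$ be such
that $f_n$ increases to $f$; we need to show that $\mathbf{I}(f_n) \to \mathbf{I}(f)$. It is clear
that $\mathbf{I}$ is monotone; in particular, $\mathbf{I}(f) \ge \limsup \mathbf{I}(f_n)$, and $\mathbf{I}(f_n) \to \mathbf{I}(f)$ if
$\sup_n \mathbf{I}(f_n) = \infty$.

Hence, we only need to show $\mathbf{I}(f) \le \liminf \mathbf{I}(f_n)$ under the condition that
$\sup_n \mathbf{I}(f_n) < \infty$. Indeed, by the definition of $\mathbf{I}(f_n)$ there exist
$(\varphi_n, \psi_n, h_n) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f_n)$ with

$$\mu(\varphi_n) + \nu(\psi_n) \le \mathbf{I}(f_n) + 1/n.$$

Proposition 5.2 then yields $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)$ with

$$\mu(\varphi) + \nu(\psi) \le \liminf [\mathbf{I}(f_n) + 1/n],$$

showing that $\mathbf{I}(f) \le \liminf \mathbf{I}(f_n)$ as desired. □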

We can now deduce the main result of this section. 


Proof of Theorem 6.2. (i) In view of Lemma 6.3, Choquet's capacitability
theorem shows that

$$\mathbf{S}(f) = \sup\{\mathbf{S}(g) : g \in \mathcal{U},\ g \le f\}, \quad f \in \mathrm{USA}_+.$$

By Lemma 6.5, the same approximation formula holds for $\mathbf{I}$, and as $\mathbf{S} = \mathbf{I}$
on $\mathcal{U}$ by Lemma 6.4, it follows that $\mathbf{S} = \mathbf{I}$ on $\mathrm{USA}_+$.

(ii) To see that the infimum is attained when it is finite, it suffices to
apply Proposition 5.2 with the constant sequence $f_n = f$. □
7 Main Results


7.1 Duality

Let $\mu \le_c \nu$ be probability measures in convex order and let $f : \mathbb{R}^2 \to [0, \infty]$
be a Borel function. We continue to denote the primal problem by

$$\mathbf{S}_{\mu,\nu}(f) := \sup_{P \in \mathcal{M}(\mu,\nu)} P(f),$$

as in the irreducible case. Some more notation needs to be introduced for
the dual problem. Let us first recall from Proposition 2.3 the decompositions

$$\mu = \sum_{k \ge 0} \mu_k, \qquad \nu = \sum_{k \ge 0} \nu_k,$$

where $\mu_k \le_c \nu_k$ is irreducible with domain $(I_k, J_k)$ for $k \ge 1$ and $\mu_0 = \nu_0$.
Moreover, $P_0$ denotes the unique element of $\mathcal{M}(\mu_0, \nu_0)$.

Let $(\varphi, \psi, h) : \mathbb{R} \to \overline{\mathbb{R}} \times \overline{\mathbb{R}} \times \mathbb{R}$ be Borel. Since $P_0$ is concentrated on the
diagonal $\Delta$, we have

$$\varphi(X) + \psi(Y) + h(X)(Y - X) = \varphi(X) + \psi(X) \quad P_0\text{-a.s.};$$

that is, the function $h$ plays no role and $\varphi$, $\psi$ enter only through their sum. In
fact, the dual problem associated to $(\mu_0, \nu_0)$ is trivially solved, for instance,
by setting $\varphi(x) = f(x, x)$ and $\psi = 0$. There is no need to use integrability
modulo concave functions, but to simplify the notation below, we set

$$L^c(\mu_0, \nu_0) := \{(\varphi, \psi) : \varphi + \psi \in L^1(\mu_0)\}$$

and $\mu_0(\varphi) + \nu_0(\psi) := \mu_0(\varphi + \psi)$ for $(\varphi, \psi) \in L^c(\mu_0, \nu_0)$. Moreover, $\mathcal{D}^{c,pw}_{\mu_0,\nu_0}(f)$
is the set of all $(\varphi, \psi, h)$ with $(\varphi, \psi) \in L^c(\mu_0, \nu_0)$ and

$$\varphi(x) + \psi(x) \ge f(x, x), \quad x \in I_0.$$

Finally, it will be convenient to define $\mathbf{I}^{pw}_{\mu_0,\nu_0}(f) := P_0(f) = \mu_0(f(X, X))$.

We can now introduce the domain for the dual problem on the whole real 
line. 

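Definition 7.1. Let $L^c(\mu, \nu)$ be the set of all Borel functions $\varphi, \psi : \mathbb{R} \to \overline{\mathbb{R}}$
such that $(\varphi, \psi) \in L^c(\mu_k, \nu_k)$ for all $k \ge 0$ and

$$\sum_{k \ge 0} |\mu_k(\varphi) + \nu_k(\psi)| < \infty.$$

For $(\varphi, \psi) \in L^c(\mu, \nu)$, we define

$$\mu(\varphi) + \nu(\psi) := \sum_{k \ge 0} \{\mu_k(\varphi) + \nu_k(\psi)\} < \infty,$$

and $\mathcal{D}^{c}_{\mu,\nu}(f)$ is the set of all Borel functions $(\varphi, \psi, h) : \mathbb{R} \to \overline{\mathbb{R}} \times \overline{\mathbb{R}} \times \mathbb{R}$ such
that $(\varphi, \psi) \in L^c(\mu, \nu)$ and

$$\varphi(X) + \psi(Y) + h(X)(Y - X) \ge f(X, Y) \quad \mathcal{M}(\mu, \nu)\text{-q.s.}$$

Finally,

$$\mathbf{I}_{\mu,\nu}(f) := \inf_{(\varphi,\psi,h) \in \mathcal{D}^{c}_{\mu,\nu}(f)} \mu(\varphi) + \nu(\psi) \in [0, \infty].$$

We emphasize that the dual domain $\mathcal{D}^{c}_{\mu,\nu}(f)$ is now defined in the
quasi-sure sense. Before making precise the correspondence with the individual
components, let us recall that the intervals $J_k$ may overlap at their endpoints,
so we have to avoid counting certain things twice. Indeed, let
$(\varphi_k, \psi_k, h_k) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$. If $J_k$ contains one of its endpoints, it is an atom
of $\nu$ and hence $\psi_k$ is finite on $J_k \setminus I_k$. Translating $\psi_k$ by an affine function and
shifting $\varphi_k$ and $h_k$ accordingly, cf. (5.1), we can thus normalize $(\varphi_k, \psi_k, h_k)$ such that

$$\psi_k = 0 \text{ on } J_k \setminus I_k. \tag{7.1}$$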

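On the strength of our analysis of the $\mathcal{M}(\mu, \nu)$-polar sets, the dual domain
can be decomposed as follows.

Lemma 7.2. Let $f : \mathbb{R}^2 \to [0, \infty]$ be Borel, let $\mu \le_c \nu$ and let $\mu_k$, $\nu_k$ be as
in Proposition 2.3.

(i) Let $(\varphi_k, \psi_k, h_k) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$ for $k \ge 1$ be normalized as in (7.1), and let
$\varphi_0(x) = f(x, x)$ and $\psi_0 = 0$. If $\sum_{k \ge 0} \{\mu_k(\varphi_k) + \nu_k(\psi_k)\} < \infty$, then

$$\varphi := \sum_{k \ge 0} \varphi_k 1_{I_k}, \qquad \psi := \sum_{k \ge 1} \psi_k 1_{J_k}, \qquad h := \sum_{k \ge 1} h_k 1_{I_k}$$

satisfies $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ and $\mu(\varphi) + \nu(\psi) = \sum_{k \ge 0} \{\mu_k(\varphi_k) + \nu_k(\psi_k)\}$.

(ii) Conversely, let $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$. After changing $\varphi$ on a $\mu$-nullset
and $\psi$ on a $\nu$-nullset, we have $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$ for $k \ge 0$ and

$$\sum_{k \ge 0} \{\mu_k(\varphi) + \nu_k(\psi)\} = \mu(\varphi) + \nu(\psi) < \infty.$$

Proof. In essence, this is a direct consequence of Proposition 2.3 and
Theorem 3.2. For (i), we note that $\mu_k(\varphi_k) + \nu_k(\psi_k) \ge 0$ for all $k$, so that the sum
is always well defined. Regarding (ii), let $B$ be the polar set of all $(x, y)$
such that $\varphi(x) + \psi(y) + h(x)(y - x) < f(x, y)$; note that $B$ is Borel because
all these functions are Borel. Then for each $k \ge 1$, the set $B \cap (I_k \times J_k)$ is
contained in a union $(N_k^\mu \times \mathbb{R}) \cup (\mathbb{R} \times N_k^\nu)$, where $N_k^\mu$ is $\mu$-null and $N_k^\nu$ is
$\nu$-null. We then set $\varphi = \infty$ on $\bigcup_{k \ge 1} N_k^\mu$ as well as on the $\mu_0$-nullset $B \cap \Delta_0$.
Proceeding analogously with $\psi$, we obtain the desired properties. □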


Remark 7.3. (i) Suppose that $\mu \le_c \nu$ is irreducible. Then, Lemma 7.2
implies that the pointwise and the quasi-sure formulation of the dual problem
agree:

$$\mathbf{I}^{pw}_{\mu,\nu}(f) = \mathbf{I}_{\mu,\nu}(f)$$

if $f = 0$ outside the domain $(I, J)$, and otherwise the difference is $P_0(f)$ due
to our definitions. Without the irreducibility condition, the formulations
may differ fundamentally; cf. Example 8.1.

(ii) As a sanity check on our definitions, we note that

$$\mu(\varphi) + \nu(\psi) = P[\varphi(X) + \psi(Y) + h(X)(Y - X)], \quad P \in \mathcal{M}(\mu, \nu)$$

whenever $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ for some $f \ge 0$, as a consequence of Lemma 7.2
and Remark 4.10.

We can now state our main duality result. 

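Theorem 7.4. Let $f : \mathbb{R}^2 \to [0, \infty]$ be Borel and let $\mu \le_c \nu$. Then

$$\mathbf{S}_{\mu,\nu}(f) = \mathbf{I}_{\mu,\nu}(f) \in [0, \infty].$$

If $\mathbf{I}_{\mu,\nu}(f) < \infty$, there exists an optimizer $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ for $\mathbf{I}_{\mu,\nu}(f)$.

Proof. We first show that $\mathbf{S}_{\mu,\nu}(f) \le \mathbf{I}_{\mu,\nu}(f)$. To this end, we may
assume that $\mathbf{I}_{\mu,\nu}(f) < \infty$, so that there exists some $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$.
By Lemma 7.2, this induces $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$, and the duality result of
Theorem 6.2 yields that

$$\mathbf{S}_{\mu,\nu}(f) \le \sum_{k \ge 0} \mathbf{S}_{\mu_k,\nu_k}(f) \le \sum_{k \ge 0} \{\mu_k(\varphi) + \nu_k(\psi)\} = \mu(\varphi) + \nu(\psi) < \infty.$$

The claim follows as $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ was arbitrary.

Next, we prove that $\mathbf{S}_{\mu,\nu}(f) \ge \mathbf{I}_{\mu,\nu}(f)$, for which we may assume that
$\mathbf{S}_{\mu,\nu}(f) < \infty$. Then $\mathbf{S}_{\mu_k,\nu_k}(f) < \infty$ for all $k \ge 0$ and by Theorem 6.2 there
exist $(\varphi_k, \psi_k, h_k) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$ such that

$$\mathbf{S}_{\mu,\nu}(f) = \sum_{k \ge 0} \mathbf{S}_{\mu_k,\nu_k}(f) = \sum_{k \ge 0} \{\mu_k(\varphi_k) + \nu_k(\psi_k)\}.$$

With the induced $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ as in Lemma 7.2, it follows that

$$\mathbf{S}_{\mu,\nu}(f) = \mu(\varphi) + \nu(\psi) \ge \mathbf{I}_{\mu,\nu}(f),$$

which shows both the claimed inequality and that $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ is
optimal for $\mathbf{I}_{\mu,\nu}(f)$. □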

Some remarks on the main result are in order. 


Remark 7.5. The lower bound on $f$ in Theorem 7.4 can easily be relaxed.
Indeed, let $f : \mathbb{R}^2 \to \overline{\mathbb{R}}$ be Borel and suppose there exist Borel functions
$(\varphi, \psi, h) : \mathbb{R} \to \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ such that $\varphi \in L^1(\mu)$, $\psi \in L^1(\nu)$ and

$$f(X, Y) \ge \varphi(X) + \psi(Y) + h(X)(Y - X) \quad \mathcal{M}(\mu, \nu)\text{-q.s.}$$

Then, we may apply Theorem 7.4 to

$$\bar f := [f(X, Y) - \varphi(X) - \psi(Y) - h(X)(Y - X)]^+$$

and the conclusion for $f$ follows, except that now $\mathbf{S}_{\mu,\nu}(f) = \mathbf{I}_{\mu,\nu}(f)$ has values
in $(-\infty, \infty]$. However, the lower bound cannot be eliminated completely; cf.
Example 8.6.

We recall that in general, the duality theorem can only hold with a
relaxed notion of integrability; cf. Examples 8.4 and 8.5. We have the following
sufficient condition for integrability in the classical sense.


Remark 7.6. Suppose that for each $k \ge 1$, either $\mu_k$ is supported on a
compact subset of $I_k$ or


for some constant $C_k$, where $m_k$ is the barycenter of $\mu_k$. Then,

$$\mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f) = \mathcal{D}^{1,pw}_{\mu_k,\nu_k}(f), \quad k \ge 1$$

and in particular the optimizer in Theorem 7.4 satisfies $\varphi \in L^1(\mu)$ and
$\psi \in L^1(\nu)$. Indeed, Remark 4.3 shows that all concave moderators can be
chosen as $\chi = 0$ in this situation.
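Remark 7.7. In the setting of Theorem 7.4 and the notation of
Proposition 2.3 and Lemma 7.2, the following relations hold.

(i) We have $\mathbf{S}_{\mu,\nu}(f) = \sum_{k \ge 0} \mathbf{S}_{\mu_k,\nu_k}(f)$ and $\mathbf{I}_{\mu,\nu}(f) = \sum_{k \ge 0} \mathbf{I}^{pw}_{\mu_k,\nu_k}(f)$.

(ii) If $P_k \in \mathcal{M}(\mu_k, \nu_k)$ is optimal for $\mathbf{S}_{\mu_k,\nu_k}(f)$ for all $k \ge 0$, then
$P \in \mathcal{M}(\mu, \nu)$ is optimal for $\mathbf{S}_{\mu,\nu}(f)$. If $\mathbf{S}_{\mu,\nu}(f) < \infty$, the converse holds as
well: if $P \in \mathcal{M}(\mu, \nu)$ is optimal for $\mathbf{S}_{\mu,\nu}(f)$, then $P_k \in \mathcal{M}(\mu_k, \nu_k)$ is
optimal for $\mathbf{S}_{\mu_k,\nu_k}(f)$ for all $k \ge 0$.

(iii) If $(\varphi_k, \psi_k, h_k) \in \mathcal{D}^{c,pw}_{\mu_k,\nu_k}(f)$ is optimal for $\mathbf{I}^{pw}_{\mu_k,\nu_k}(f)$ for all $k \ge 0$, then
$(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ is optimal for $\mathbf{I}_{\mu,\nu}(f)$. If $\mathbf{I}_{\mu,\nu}(f) < \infty$, the converse
holds as well.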


7.2 Monotonicity Principle 

An important consequence of the duality is the subsequent monotonicity 
principle describing the support of optimal transports; its second part can 
be seen as a substitute for the cyclical monotonicity from classical transport 
theory. While similar results have been obtained in [Tj Lemma 1.11] and 
[43, Theorem 3.6], the present version is stronger in several ways. First, it is
stated with a set $\Gamma$ that is universal; i.e., independent of the measure under
consideration; second, we remove growth and integrability conditions on /; 
and third, the reward function is measurable rather than continuous. 

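Corollary 7.8 (Monotonicity Principle). Let $f : \mathbb{R}^2 \to [0, \infty]$ be Borel, let
$\mu \le_c \nu$ be probability measures and suppose that $\mathbf{S}_{\mu,\nu}(f) < \infty$. There exists
a Borel set $\Gamma \subseteq \mathbb{R}^2$ with the following properties.

(i) A measure $P \in \mathcal{M}(\mu, \nu)$ is concentrated on $\Gamma$ if and only if it is optimal
for $\mathbf{S}_{\mu,\nu}(f)$.

(ii) Let $\bar\mu \le_c \bar\nu$ be probabilities on $\mathbb{R}$. If $\bar P \in \mathcal{M}(\bar\mu, \bar\nu)$ is concentrated on $\Gamma$,
then $\bar P$ is optimal for $\mathbf{S}_{\bar\mu,\bar\nu}(f)$.

If $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ is a suitable version (chosen as in Lemma 7.2 (ii)) of the
optimizer from Theorem 7.4, then we can take the following set for $\Gamma$:

$$\{(x, y) \in \mathbb{R}^2 : \varphi(x) + \psi(y) + h(x)(y - x) = f(x, y)\} \cap \Big(\Delta \cup \bigcup_{k \ge 1} I_k \times J_k\Big).$$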
Proof. As $\mathbf{S}_{\mu,\nu}(f) < \infty$, Theorem 7.4 yields a dual optimizer
$(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ and we can define $\Gamma$ as above. By Remark 4.10,

$$P'(f) \le P'[\varphi(X) + \psi(Y) + h(X)(Y - X)] = \mu(\varphi) + \nu(\psi) = \mathbf{S}_{\mu,\nu}(f) \tag{7.2}$$

for all $P' \in \mathcal{M}(\mu, \nu)$, whereas for $P \in \mathcal{M}(\mu, \nu)$ with $P(\Gamma) = 1$, the same
holds with equality. This shows that $P(f) = \mathbf{S}_{\mu,\nu}(f)$. For the converse
in (i), we observe that the inequality in (7.2) is strict if $P'(\Gamma) < 1$, and then
$\mathbf{S}_{\mu,\nu}(f) < \infty$ shows that $P'$ cannot be a maximizer.

For the proof of (ii), we choose a version of $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\mu,\nu}(f)$ as in
Lemma 7.2 (ii); moreover, we may assume that $\bar P(f) < \infty$. We shall show
that $(\varphi, \psi, h) \in \mathcal{D}^{c}_{\bar\mu,\bar\nu}(f)$; once this is established, the proof of optimality is
the same as above.

(a) On the one hand, we need to show that

$$\varphi(X) + \psi(Y) + h(X)(Y - X) \ge f(X, Y) \quad \mathcal{M}(\bar\mu, \bar\nu)\text{-q.s.} \tag{7.3}$$

For this, it suffices to prove that the domains of the irreducible components
of $\bar\mu \le_c \bar\nu$ are subsets of the ones of $\mu \le_c \nu$; i.e., that $u_\mu(x) = u_\nu(x)$ implies
$u_{\bar\mu}(x) = u_{\bar\nu}(x)$, for any $x \in \mathbb{R}$. Indeed, let $u_\mu(x) = u_\nu(x)$. Since $\bar P$ is
concentrated on $\Gamma \subseteq \Delta \cup \bigcup_{k \ge 1} I_k \times J_k$, we know that $Y \ge x$ $\bar P$-a.s. on the
set $\{X \ge x\}$. Writing $E[\cdot]$ for the expectation under $\bar P$, it follows that

$$E[|X - x| 1_{X \ge x}] = E[(X - x) 1_{X \ge x}] = E[(Y - x) 1_{X \ge x}] = E[|Y - x| 1_{X \ge x}],$$

where we have used that $E[Y | X] = X$ $\bar P$-a.s. An analogous identity holds
for $\{X < x\}$, and thus

$$u_{\bar\mu}(x) = E[|X - x|] = E[|Y - x|] = u_{\bar\nu}(x)$$

as desired.

(b) On the other hand, we need to show that $(\varphi, \psi) \in L^c(\bar\mu, \bar\nu)$. By
reducing to the components, we may assume without loss of generality that $(\bar\mu, \bar\nu)$
is irreducible with domain $(I, J)$. As $(\varphi, \psi, h) \in \mathcal{D}^{c,pw}_{\mu,\nu}(f)$ and $\bar P(\Gamma) = 1$, we
have $\bar P[\varphi(X) + \psi(Y) + h(X)(Y - X)] = \bar P(f) < \infty$, and now Remark 5.4
implies that $(\varphi, \psi) \in L^c(\bar\mu, \bar\nu)$ as desired. □

We note that the dual optimizer $(\varphi, \psi, h)$ need not be unique, and a
different choice may lead to a different set $\Gamma$. Moreover, we observe that an
optimal $P \in \mathcal{M}(\mu, \nu)$ need not exist. However, the following yields a fairly
general sufficient criterion in the spirit of [9].
Remark 7.9. Let $f : \mathbb{R}^2 \to [0, \infty]$ be Borel, let $\mu \le_c \nu$ be probability
measures and suppose that $\mathbf{S}_{\mu,\nu}(f) < \infty$. Suppose there exist a Polish
topology $\tau$ on $\mathbb{R}$ and a function $\bar f : \mathbb{R}^2 \to [0, \infty]$ such that $\bar f$ is upper
semicontinuous for $\tau \otimes \tau$ and $f = \bar f$ $\mathcal{M}(\mu, \nu)$-q.s. Then, there exists an
optimal $P \in \mathcal{M}(\mu, \nu)$ for $\mathbf{S}_{\mu,\nu}(f)$.

Indeed, the induced weak topology on $\mathcal{M}(\mu, \nu)$ does not depend on the
choice of $\tau$; cf. [H Lemma 2.3]. Thus, under the stated conditions, the
mapping $P \mapsto P(f)$ is upper semicontinuous on the compact set $\mathcal{M}(\mu, \nu)$,
and the result follows. We remark that compactness need not hold if
non-product topologies are considered on $\mathbb{R}^2$, hence the use of $\tau \otimes \tau$.

The flexibility of choosing $\tau$ allows us to include a broad class of functions.
Consider for instance $f$ of the product form $f(x, y) = f_1(x) f_2(y)$, where $f_1$
and $f_2$ are Borel measurable, or more generally any continuous function of
$f_1(x)$ and $f_2(y)$. Then, we can choose $\tau$ such as to make $f$ continuous (cf.
the proof of O Theorem 1]) and the above applies.


Remark 7.10. Corollary 7.8 is a version of the classical "Fundamental
Theorem of Optimal Transport," see e.g. [2, Theorem 2.13], where $\Gamma$ is the graph
of the $c$-superdifferential of a $c$-concave function, the so-called Kantorovich
potential (here $c = -f$ is the cost function). In our context, the roles of $\varphi$
and $\psi$ are not symmetric, and it is $\psi$ that constitutes the analogue of the
Kantorovich potential. Indeed, $\varphi$ and $h$ can easily be obtained from $\psi$ by
taking a concave envelope and its derivative, respectively; see the end of the
proof of Proposition 5.2.


8 Counterexamples 


In this section, we present five counterexamples. Examples 8.1 and 8.2 show
that the duality theory fails in the pointwise formulation; i.e.,

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge f(x, y) \quad \text{for all } (x, y) \in \mathbb{R}^2,$$

and thus justify our quasi-sure approach. The subsequent two examples
demonstrate that a relaxed notion of integrability is necessary for the dual
elements, and the final example shows that duality fails if $f$ does not have
any lower bound.

Our first example shows that a duality gap may occur with the pointwise 
formulation of the dual problem. 
Example 8.1 (Duality Gap in Pointwise Formulation). We exhibit a
situation where

(i) the reward function $f$ is bounded;

(ii) a primal optimizer exists;

(iii) if the dual problem is formulated in the pointwise sense, dual optimizers
exist but there is a duality gap.

Indeed, let $\mu$ be the restriction of the Lebesgue measure $\lambda$ to $[0, 1]$. Setting
$\nu = \mu$, the set $\mathcal{M}(\mu, \nu)$ has a unique element, the law $P_0$ of $x \mapsto (x, x)$ under
$\mu$, which is nothing but the uniform distribution on the diagonal of the unit
square $[0, 1]^2$. Consider the bounded reward function $f(x, y) := 1_{x \ne y}$, which
is lower (but not upper) semicontinuous. Since $P_0$ is concentrated on the
diagonal, the primal value of the problem is

$$\sup_{P \in \mathcal{M}(\mu,\nu)} P[f] = P_0[f] = 0.$$

Now let $\varphi, \psi, h$ be Borel functions such that

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge f(x, y) \quad \text{for all } x, y \in [0, 1];$$

then in particular

$$\varphi(x) + \psi(y) + h(x)(y - x) \ge 1 \quad \text{for all } x \ne y \in [0, 1].$$

Let $\varepsilon > 0$. By Lusin's theorem, there exists a Borel set $A \subseteq [0, 1]$ with
$\lambda(A) > 1 - \varepsilon$ such that the restriction $\psi|_A$ is continuous. Using another fact
from measure theory m Exercise 1.12.63, p. 85], the set $A$ can be chosen to
be perfect; i.e., every point in $A$ is a limit point of $A$. Now let $x \in A$ and let
$x_n \in A$ be a sequence of distinct points such that $x_n \to x$. Then passing to
the limit in

$$\varphi(x) + \psi(x_n) + h(x)(x_n - x) \ge 1$$

yields that

$$\varphi(x) + \psi(x) \ge 1 \quad \text{for all } x \in A.$$

As $\varepsilon > 0$ was arbitrary, it follows that $\lambda\{x \in [0, 1] : \varphi(x) + \psi(x) \ge 1\} = 1$.
In particular, $\mu(\varphi) + \nu(\psi) \ge 1$. This bound is attained, for instance, by
the triplet $\varphi = 1$, $\psi = 0$, $h = 0$, so that the dual problem in the pointwise
formulation admits an optimizer and has value 1; in particular, there is a
duality gap in the pointwise formulation.
The next example shows that in general, the pointwise formulation fails
to admit a dual optimizer. Such an example was already presented in [5],
using marginals with infinitely many irreducible components. The subsequent
example shows that existence may fail even with finitely many (two)
components and in a reasonably generic setting.

Example 8.2 (No Dual Attainment in the Pointwise Formulation). We 
describe a setting where 

(i) the reward function is continuous and the marginals are compactly 
supported (but not irreducible); 

(ii) there is no duality gap for either formulation of the dual problem; 

(iii) there is no optimizer for the pointwise formulation of the dual problem. 

We fix two measures $\mu \le_c \nu$ supported on $(-1,1)$ such that there are two
irreducible components with domains $I_1 \times J_1 = (-1,0)^2$ and $I_2 \times J_2 = (0,1)^2$.
Moreover, we assume that the origin is in the (topological) supports of $\mu$ and
$\nu$; for instance, $\mu$ and $\nu$ could both be equivalent to the Lebesgue measure
on $(-1,1)$, or they could be discrete with atoms accumulating at the origin.
The reward function $f$ is any continuous function of linear growth such that

$f = 0$ on $(-1,0)^2 \cup (0,1)^2$ and

$f$ is not $(\mu \times \nu)$-a.s. bounded from above by a linear function on
$(-1,0) \times (0,1)$.

An example is f{x,y) = \/Ml(-i,o)x(o,i)- 

Suppose for contradiction that $(\varphi, \psi, h)$ is a dual minimizer for the pointwise
formulation; then

\[ \varphi(x) + \psi(y) + h(x)(y - x) \ge 0, \quad (x,y) \in (-1,0)^2 \cup (0,1)^2. \]

We have $S_{\mu,\nu}(f) = 0$ and as $f$ is continuous with linear growth, there is no
duality gap (even for the pointwise formulation); cf. [5, Corollary 1.1]. It
follows that $P[\varphi(X) + \psi(Y) + h(X)(Y - X)] = 0$ for all $P \in \mathcal{M}(\mu,\nu)$ and
thus

\[ \varphi(X) + \psi(Y) + h(X)(Y - X) = 0 \quad \mathcal{M}(\mu,\nu)\text{-q.s.} \]

Let $N_\mu$ and $N_\nu$ be the corresponding nullsets as in Theorem 3.2 and write
$I_\mu$ for $I \setminus N_\mu$ whenever $I$ is an interval, and analogously $I_\nu$ for $I \setminus N_\nu$. Then

\[ \varphi(x) + \psi(y) + h(x)(y - x) = 0, \quad (x,y) \in [(-1,0)_\mu \times (-1,0)_\nu] \cup [(0,1)_\mu \times (0,1)_\nu] \]
and in particular, fixing an arbitrary $x_0 \in (0,1)_\mu$ yields

\[ \psi(y) = -\varphi(x_0) - h(x_0)(y - x_0), \quad y \in (0,1)_\nu, \]

so that $\psi$ must be an affine function $\psi(y) = a_+ y + d_+$ on $(0,1)_\nu$. It then
follows that $h = -a_+$ on $(0,1)_\mu$ and $\varphi(x) = -a_+ x - d_+$ on $(0,1)_\mu$, and a
similar argument gives rise to constants $a_-, d_-$ for $(-1,0)$. Now, spelling
out the condition

\[ \varphi(x) + \psi(y) + h(x)(y - x) \ge f(x,y) \]

yields

\[ (a_- - a_+) y + (d_- - d_+) \ge f(x,y), \quad (x,y) \in (0,1)_\mu \times (-1,0)_\nu, \]

\[ (a_+ - a_-) y + (d_+ - d_-) \ge f(x,y), \quad (x,y) \in (-1,0)_\mu \times (0,1)_\nu. \]

Since $f(0,0) = 0$ and $0$ is an accumulation point of the intervals appearing
on the right-hand side, it follows that $d_- = d_+$; but then $f$ is
$(\mu \times \nu)$-a.s. bounded from above by a linear function on $(-1,0)_\mu \times (0,1)_\nu$,
and likewise on $(0,1)_\mu \times (-1,0)_\nu$. This is the desired contradiction.
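The algebra behind the two displayed inequalities can be checked mechanically. The following Python sketch is only an illustration: the helper names (`make_triple`, `dual_lhs`, `identities_hold`) and the sample constants are hypothetical and not part of the paper. It builds the piecewise-affine triple forced on the two components and confirms, in exact rational arithmetic on a small grid, that the dual expression vanishes on the two squares and reduces to the stated affine functions of $y$ on the two off-diagonal rectangles.

```python
from fractions import Fraction
from itertools import product


def make_triple(a_minus, d_minus, a_plus, d_plus):
    """Return (phi, psi, h), affine on each of (-1, 0) and (0, 1)."""
    def psi(y):
        return a_plus * y + d_plus if y > 0 else a_minus * y + d_minus

    def phi(x):
        return -a_plus * x - d_plus if x > 0 else -a_minus * x - d_minus

    def h(x):
        return -a_plus if x > 0 else -a_minus

    return phi, psi, h


def dual_lhs(triple, x, y):
    """Evaluate phi(x) + psi(y) + h(x) * (y - x)."""
    phi, psi, h = triple
    return phi(x) + psi(y) + h(x) * (y - x)


def identities_hold(a_minus, d_minus, a_plus, d_plus):
    """Compare the dual expression with the closed forms on a rational grid."""
    triple = make_triple(a_minus, d_minus, a_plus, d_plus)
    grid = [Fraction(k, 10) for k in range(-9, 10) if k != 0]
    for x, y in product(grid, repeat=2):
        if x * y > 0:            # same component: (-1, 0)^2 or (0, 1)^2
            expected = 0
        elif x > 0:              # x in (0, 1), y in (-1, 0)
            expected = (a_minus - a_plus) * y + (d_minus - d_plus)
        else:                    # x in (-1, 0), y in (0, 1)
            expected = (a_plus - a_minus) * y + (d_plus - d_minus)
        if dual_lhs(triple, x, y) != expected:
            return False
    return True


# Arbitrary sample constants, chosen only for illustration.
print(identities_hold(Fraction(1, 3), Fraction(2), Fraction(-5, 7), Fraction(1, 4)))
```

The check ignores the nullsets $N_\mu$, $N_\nu$ and says nothing about $f$ itself; it only confirms the bookkeeping of the constants $a_\pm$, $d_\pm$.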

Remark 8.3. Nothing essential changes in Example 8.2 if $\nu$ has one or more
atoms at the boundary of the intervals $I_k$. As a matter of fact, the example
suggests that one can expect non-existence for the pointwise formulation as
soon as there are at least two adjacent irreducible components, the reward
function is not Lipschitz where they touch, and the marginals exhibit some
richness (in particular, have infinite support).

The next two examples concern the quasi-sure version of the dual problem;
i.e., the setting of the main part of the present paper, and in particular
the notion of integral introduced earlier in the paper. The first one shows that it is
necessary to relax the notion of integrability in order to have existence for
the dual problem.

Example 8.4 (Failure of Integrability for Optimizers). We exhibit a situation
where

(i) the reward function $f$ is bounded;

(ii) primal and dual optimizers exist and there is no duality gap;

(iii) whenever $(\varphi, \psi, h) \in \mathcal{D}^c_{\mu,\nu}(f)$ is a dual optimizer, $\varphi$ is not $\mu$-integrable
and $\psi$ is not $\nu$-integrable.
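Indeed, let $(c_i)_{i \ge 1}$ be a sequence of strictly positive numbers satisfying
$\sum_i c_i = 1$ such that the probability measure

\[ \mu := \sum_{i \ge 1} c_i \, \delta_i \]

has finite first moment but infinite second moment. Moreover, set

\[ \nu := \frac{1}{3} \sum_{i \ge 1} c_i \, (\delta_{i-1} + \delta_i + \delta_{i+1}) \]

and note that the moments of $\nu$ then have the same property. Finally, our
reward function is given by $f(x,y) = \mathbf{1}_{x \neq y}$.

We observe that

\[ P := \frac{1}{3} \sum_{i \ge 1} c_i \, \delta_i \otimes (\delta_{i-1} + \delta_i + \delta_{i+1}) \in \mathcal{M}(\mu,\nu); \]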

in particular, $\mu \le_c \nu$. Moreover, let $\varphi(x) = -x^2$, $\psi(y) = y^2$ and $h(x) = -2x$;
then we have

\[ \varphi(x) + \psi(y) + h(x)(y - x) = -x^2 + y^2 - 2x(y - x) = (x - y)^2 \ge f(x,y) \]

for all $(x,y) \in \mathbb{N} \times \mathbb{N}_0$, with equality holding on the set

\[ \Gamma := \{(x,y) \in \mathbb{N} \times \mathbb{N}_0 : y \in \{x - 1, x, x + 1\}\}. \]

Since $P$ is concentrated on $\Gamma$, it follows as in Corollary 7.8 that $P \in \mathcal{M}(\mu,\nu)$
is a primal optimizer and $(\varphi, \psi, h) \in \mathcal{D}^c_{\mu,\nu}(f)$ is a dual optimizer. One can
observe that a concave moderator is given by $\chi(y) = -y^2$.
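The feasibility of this triple and the shape of its equality set can be confirmed on a finite window of the lattice. The sketch below is illustrative only: it assumes that the reward is the off-diagonal indicator $\mathbf{1}_{x \neq y}$ (the choice for which equality on $\Gamma$ holds), and the function names and the window size are hypothetical.

```python
def reward(x, y):
    """Assumed reward: indicator of the off-diagonal {x != y}."""
    return 1 if x != y else 0


def dual_lhs(x, y):
    """phi(x) + psi(y) + h(x) * (y - x) for phi = -x^2, psi = y^2, h = -2x."""
    return -x * x + y * y + (-2 * x) * (y - x)


def equality_set(n_max):
    """Points of {1..n_max} x {0..n_max+1} where the dual inequality is tight."""
    tight = set()
    for x in range(1, n_max + 1):
        for y in range(0, n_max + 2):
            lhs = dual_lhs(x, y)
            assert lhs == (x - y) ** 2      # the algebraic identity
            assert lhs >= reward(x, y)      # dual feasibility on the window
            if lhs == reward(x, y):
                tight.add((x, y))
    return tight


def gamma(n_max):
    """The set Gamma restricted to x in {1..n_max}."""
    return {(x, y) for x in range(1, n_max + 1) for y in (x - 1, x, x + 1)}


print(equality_set(30) == gamma(30))
```

On the window, the inequality is tight exactly on the three diagonals $y \in \{x-1, x, x+1\}$, which is why any transport concentrated on $\Gamma$ is optimal.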

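Now let $(\varphi, \psi, h) \in \mathcal{D}^c_{\mu,\nu}(f)$ be an arbitrary optimizer; then we must have

\[ \varphi(x) + \psi(y) + h(x)(y - x) = f(x,y) \quad P\text{-a.s.} \]

and hence, by the definition of $P$, this equality holds for all $(x,y) \in \Gamma$. It
follows that

\[ \begin{cases} \varphi(x) + \psi(x - 1) - h(x) = 1, \\ \varphi(x) + \psi(x + 1) + h(x) = 1, \\ \varphi(x) + \psi(x) = 0 \end{cases} \quad \text{for all } x \in \mathbb{N}. \]

In particular, $\varphi = -\psi$ on $\mathbb{N}$ and

\[ 2\varphi(x) - \varphi(x - 1) - \varphi(x + 1) = 2, \quad x \in \mathbb{N}. \]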
All solutions of this difference equation satisfy

\[ \varphi(x) = -x^2 + bx + c, \quad x \in \mathbb{N}, \]

for some constants $b, c \in \mathbb{R}$. In particular, $\varphi^-$ is not $\mu$-integrable and $\psi^+$ is
not $\nu$-integrable, and as a result, there exists no dual optimizer whose
components $\varphi$ and $\psi$ are integrable with respect to $\mu$ and $\nu$, respectively.
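The claim about the difference equation can be illustrated by solving the recursion forward from two initial values and comparing with the quadratic closed form. This is a minimal sketch, not taken from the paper: the names `solve_forward` and `closed_form` are hypothetical, and starting the index at $x = 0$ is a convenience of the sketch (the value at $0$ plays the role of an auxiliary initial condition).

```python
from fractions import Fraction


def solve_forward(phi0, phi1, n_max):
    """Values phi(0..n_max) determined by 2*phi(x) - phi(x-1) - phi(x+1) = 2."""
    values = [phi0, phi1]
    for x in range(1, n_max):
        # Rearranged recursion: phi(x+1) = 2*phi(x) - phi(x-1) - 2.
        values.append(2 * values[x] - values[x - 1] - 2)
    return values


def closed_form(phi0, phi1, x):
    """The quadratic -x^2 + b*x + c matching the two initial values."""
    b = phi1 - phi0 + 1
    c = phi0
    return -x * x + b * x + c


# Arbitrary initial values, chosen only for illustration.
start0, start1 = Fraction(3, 2), Fraction(-7)
values = solve_forward(start0, start1, 50)
print(all(values[x] == closed_form(start0, start1, x) for x in range(51)))
```

Since two initial values determine the whole solution, every solution is of the form $-x^2 + bx + c$, and the leading term $-x^2$ cannot be removed by any choice of $b$ and $c$.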

The next example shows that without a relaxed notion of integrability, 
the dual problem may be infinite even if the primal problem is finite. 

Example 8.5 (Integrability Requirement Causes Duality Gap). We exhibit
a situation where

(i) the reward function $f$ is continuous;

(ii) primal and dual problem are finite;

(iii) the set of dual elements $(\varphi, \psi, h)$ with $\varphi \in L^1(\mu)$ and $\psi \in L^1(\nu)$ is empty; in particular, there is a duality gap if
the dual domain is replaced by this set in the definition of the dual problem.


Let $\mu \le_c \nu$ be as in Example 8.4; we now make the specific choice

\[ c_i = i^{-3} C, \quad i \in \mathbb{N}, \]

where $C$ is the normalizing constant. This ensures that $\mu$ and $\nu$ have a first
but no second moment. Moreover, the strict concavity of i i—>• implies 

that 

zGN. 

The associated potential functions satisfy $u_\mu = u_\nu$ on $(-\infty, 0]$. If there were
$x > 0$ with $u_\mu(x) = u_\nu(x)$, then as $\mu$ is the second (distributional) derivative
of $u_\mu / 2$, we would have $\nu(\{x\}) \ge \mu(\{x\})$, a contradiction. As a result,
$\mu \le_c \nu$ is irreducible with domain $(I, J)$ given by $I = (0, \infty)$, $J = [0, \infty)$.
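The moment behaviour of the weights used in this example can be illustrated numerically. The sketch below assumes that the weights are proportional to $i^{-3}$ (the exponent is read from the garbled source and should be treated as an assumption); the function name and the truncation levels are hypothetical. The truncated first moment stabilizes, while the truncated second moment keeps growing like a harmonic sum.

```python
def partial_moments(n_terms):
    """First and second moments of the normalized weights i^-3, i = 1..n_terms."""
    total = first = second = 0.0
    for i in range(1, n_terms + 1):
        weight = 1.0 / i ** 3          # unnormalized weight of the atom at i
        total += weight
        first += i * weight            # contributes 1 / i^2 (summable)
        second += i * i * weight       # contributes 1 / i   (not summable)
    return first / total, second / total


for n in (10 ** 3, 10 ** 4, 10 ** 5):
    print(n, partial_moments(n))
```

A finite truncation can only suggest divergence, of course; the actual statement follows from comparing with the series $\sum_i i^{-2}$ and $\sum_i i^{-1}$.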

For the reward function, we now consider

\[ f(x,y) = (x - y)^2. \]

As seen in Example 8.4, setting $\varphi(x) = -x^2$, $\psi(y) = y^2$ and $h(x) = -2x$
yields $(\varphi, \psi, h) \in \mathcal{D}^c_{\mu,\nu}(f)$ with concave moderator $\chi(y) = -y^2$. In fact,
$\mu(\varphi) + \nu(\psi) = P(\mathbf{1}_{X \neq Y}) < 1$ in the notation of Example 8.4 and thus
$S_{\mu,\nu}(f) < 1$.
Suppose that there exists some dual element $(\varphi, \psi, h)$ for $f$ with $\varphi \in L^1(\mu)$ and $\psi \in L^1(\nu)$. Since $\mu \le_c \nu$ is
irreducible, Corollary 3.4 shows that every point in $\mathbb{N} \times \mathbb{N}_0$ is charged by
some element of $\mathcal{M}(\mu,\nu)$ and hence

\[ \varphi(x) + \psi(y) + h(x)(y - x) \ge f(x,y) = x^2 + y^2 - 2xy \quad \text{for all } (x,y) \in \mathbb{N} \times \mathbb{N}_0. \]

We see that $\psi$ must have at least quadratic growth in $y$, and thus $\psi \notin L^1(\nu)$
and $\varphi \notin L^1(\mu)$. As a result, no such dual element exists and the corresponding dual problem
has infinite value, whereas the primal one satisfies $0 \le S_{\mu,\nu}(f) < 1$.

Our last example shows that a duality gap may occur (even in the quasi-sure
formulation) if $f$ does not have any lower bound. This should be compared
with [5, Theorem 1] which shows that there is no duality gap if $f$ is
upper semicontinuous with values in $[-\infty, \infty)$.

Example 8.6 (Duality Gap Without Lower Bound). We exhibit a situation 
where 


(i) the reward function $f$ takes values in $[-\infty, 0]$;

(ii) primal and dual optimizers exist; 


(iii) there is a duality gap. 


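Indeed, let $\mu = \lambda|_{[0,1]}$ be the restriction of the Lebesgue measure to $[0,1]$,
fix a constant $\Delta > 0$ and

\[ \nu = \frac{1}{2} \left( \lambda|_{[-\Delta, 1 - \Delta]} + \lambda|_{[\Delta, 1 + \Delta]} \right). \]

Then $\mu \le_c \nu$ is irreducible with domain given by $I = J = (-\Delta, 1 + \Delta)$.
Indeed, a particular element of $\mathcal{M}(\mu,\nu)$ is given by $P = \mu \otimes \kappa$, where

\[ \kappa(x) = \frac{\delta_{x - \Delta} + \delta_{x + \Delta}}{2}. \]

For the reward function, we choose

\[ f(x,y) = \begin{cases} 0 & \text{if } |x - y| < \Delta, \\ -1 & \text{if } |x - y| = \Delta, \\ -\infty & \text{if } |x - y| > \Delta. \end{cases} \]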

We first analyze the primal problem. Let $P' \in \mathcal{M}(\mu,\nu)$ and let $P' = \mu \otimes \kappa'$
be a disintegration. We observe that

\[ \iint (x - y)^2 \, \kappa'(x, dy) \, \mu(dx) = \mathrm{Var}(\nu) - \mathrm{Var}(\mu) = \Delta^2. \]

If $P'(f) > -\infty$, then $\kappa'(x)\{|x - y| > \Delta\} = 0$ for $\mu$-a.e. $x$ and the above
implies that $|x - y| = \Delta$ holds $\kappa'(x)$-a.s. for $\mu$-a.e. $x$; by the martingale property, it follows that $P' = P$. As a result,
$P'(f) = -\infty$ for all $P \neq P' \in \mathcal{M}(\mu,\nu)$ and

\[ \sup_{P' \in \mathcal{M}(\mu,\nu)} E^{P'}[f] = E^{P}[f] = -1. \]
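The variance identity used in the primal analysis can be verified in exact arithmetic. The sketch below is illustrative: it assumes that $\nu$ is the second marginal of $P = \mu \otimes \kappa$, i.e. the equal mixture of the uniform distributions on the unit interval shifted by $-\Delta$ and by $+\Delta$, and the helper names are hypothetical.

```python
from fractions import Fraction


def uniform_moments(a, b):
    """Mean and second moment of the uniform distribution on [a, b]."""
    mean = (a + b) / 2
    second = (a * a + a * b + b * b) / 3
    return mean, second


def variance_gap(delta):
    """Var(nu) - Var(mu) for mu uniform on [0, 1] and nu the shifted mixture."""
    mean_mu, second_mu = uniform_moments(Fraction(0), Fraction(1))
    left = uniform_moments(-delta, 1 - delta)     # unit interval shifted by -delta
    right = uniform_moments(delta, 1 + delta)     # unit interval shifted by +delta
    mean_nu = (left[0] + right[0]) / 2
    second_nu = (left[1] + right[1]) / 2
    var_mu = second_mu - mean_mu ** 2
    var_nu = second_nu - mean_nu ** 2
    return var_nu - var_mu


delta = Fraction(3, 7)                            # arbitrary sample value
print(variance_gap(delta) == delta ** 2)
```

The first equality in the display is the usual martingale computation: since the kernel has mean $x$, the cross term satisfies $E[XY] = E[X^2]$, so $E[(X - Y)^2] = E[Y^2] - E[X^2]$, and the means of $\mu$ and $\nu$ coincide.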

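We now turn to the dual problem; since $\mu \le_c \nu$ is irreducible, the quasi-sure
formulation is equivalent to the pointwise one. Let $\varphi, \psi, h$ be Borel
functions such that

\[ \varphi(x) + \psi(y) + h(x)(y - x) \ge f(x,y) \quad \text{for all } (x,y) \in I \times J; \]

then in particular

\[ \varphi(x) + \psi(x + \delta) + h(x)\delta \ge 0 \quad \text{for all } x \in (0,1),\ \delta \in [0, \Delta), \]

\[ \varphi(x) + \psi(x - \delta) - h(x)\delta \ge 0 \quad \text{for all } x \in (0,1),\ \delta \in [0, \Delta). \]

Adding these two inequalities yields

\[ \varphi(x) + \frac{\psi(x - \delta) + \psi(x + \delta)}{2} \ge 0 \quad \text{for all } x \in (0,1),\ \delta \in [0, \Delta). \]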

Let $\varepsilon > 0$. As in Example 8.1, Lusin's theorem can be used to find a set
$A \subseteq (0,1)$ with $\lambda(A) > 1 - \varepsilon$ such that for all $x \in A$ there exists a sequence
$\delta_n = \delta_n(x)$ in $[0, \Delta)$ with $\psi(x \pm \delta_n) \to \psi(x \pm \Delta)$. Thus, passing to the limit in the
above inequality shows that

\[ \varphi(x) + \frac{\psi(x - \Delta) + \psi(x + \Delta)}{2} \ge 0 \quad \text{for all } x \in A, \]

and as $\varepsilon > 0$ was arbitrary, the inequality holds $\mu$-a.e. But then

\[ \mu(\varphi) + \nu(\psi) = P[\varphi(X) + \psi(Y)] = \int \left( \varphi(x) + \frac{\psi(x - \Delta) + \psi(x + \Delta)}{2} \right) \mu(dx) \ge 0. \]

As a result, the dual value is zero and a dual optimizer is given for instance
by $\varphi = \psi = h = 0$.